You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@apex.apache.org by PramodSSImmaneni <gi...@git.apache.org> on 2015/12/11 15:20:51 UTC

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

GitHub user PramodSSImmaneni opened a pull request:

    https://github.com/apache/incubator-apex-core/pull/187

    Checkpoint notification to notify operators before checkpoint is performed

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/PramodSSImmaneni/incubator-apex-core APEX-78

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-apex-core/pull/187.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #187
    
----
commit 99aa594160778be963999be2cf5d703945b06d2c
Author: Pramod Immaneni <pr...@datatorrent.com>
Date:   2015-12-11T14:03:29Z

    APEX-78 #comment Checkpoint notification to notify operators before checkpoint is performed

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by gauravgopi123 <gi...@git.apache.org>.
Github user gauravgopi123 commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r47708448
  
    --- Diff: api/src/main/java/com/datatorrent/api/Operator.java ---
    @@ -270,4 +271,29 @@ public String toString()
     
       }
     
    +  /**
    +   * Operators that need to be notified about checkpoint events should implement this interface.
    +   *
    +   * The notification callbacks in this interface are called outside window boundaries so the operators should not
    +   * attempt to send any tuples in these callbacks.
    +   *
    +   */
    +  public static interface CheckpointNotificationListener extends CheckpointListener
    +  {
    +    /**
    +     * Notify the operator that a checkpoint is about to be performed.
    +     *
    +     * Operators may need to perform certain tasks just before a checkpoint such as calling flush on a stream. Having
    +     * this notification helps operators perform such operations optimally by doing them once before checkpoint as
    +     * opposed to doing it repeatedly at the end of every window.
    +     *
    +     * The method will be called before the checkpoint is performed. It will be called after
    +     * {@link Operator#endWindow()} call of the window preceding the checkpoint and before the checkpoint is
    +     * actually performed.
    +     *
    +     * @param windowId The window id of the window preceding the checkpoint
    +     */
    +    public void checkpoint(long windowId);
    --- End diff --
    
    No need for `public`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by chinmaykolhatkar <gi...@git.apache.org>.
Github user chinmaykolhatkar commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r48931576
  
    --- Diff: engine/src/main/java/com/datatorrent/stram/engine/Node.java ---
    @@ -485,6 +485,10 @@ protected void deactivateSinks()
     
       void checkpoint(long windowId)
       {
    +    if (operator instanceof Operator.CheckpointNotificationListener) {
    --- End diff --
    
    I think same would apply for checkpointed call also. Is that right?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by PramodSSImmaneni <gi...@git.apache.org>.
Github user PramodSSImmaneni commented on the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#issuecomment-169831429
  
    @tweise made the change and squashed


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by PramodSSImmaneni <gi...@git.apache.org>.
Github user PramodSSImmaneni commented on the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#issuecomment-170121776
  
    I rebased it but I am having some second thoughts about a different interface based on the japicmp discussion that @chandnisingh sent earlier.
    
    What do you guys think about relaxing the backwards compatibility restriction in the following way in cases where an "optional" callback needs to be added. We ensure that an app compiled against an older version still *runs* by doing runtime checks or with error handling even though the app will *not compile* against the new version without changes. When we are going to a new major version then we can remove those runtime checks.
    
    This may be a scenario we might see a few times in future and using a different interface each time might just make this a little problematic. I was planning to look at this over the weekend and in course see if I would need to implement a small utility for the runtime checks that can be reused in the future.
    
    Let me know if you think we should be having an email discussion on apex-dev about this rather than this JIRA. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by gauravgopi123 <gi...@git.apache.org>.
Github user gauravgopi123 commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r47708357
  
    --- Diff: api/src/main/java/com/datatorrent/api/Operator.java ---
    @@ -270,4 +271,29 @@ public String toString()
     
       }
     
    +  /**
    +   * Operators that need to be notified about checkpoint events should implement this interface.
    +   *
    +   * The notification callbacks in this interface are called outside window boundaries so the operators should not
    +   * attempt to send any tuples in these callbacks.
    +   *
    +   */
    +  public static interface CheckpointNotificationListener extends CheckpointListener
    --- End diff --
    
    No need for 
    ```java
     public static
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by gauravgopi123 <gi...@git.apache.org>.
Github user gauravgopi123 commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r48623987
  
    --- Diff: api/src/main/java/com/datatorrent/api/Operator.java ---
    @@ -270,4 +271,29 @@ public String toString()
     
       }
     
    +  /**
    +   * Operators that need to be notified about checkpoint events should implement this interface.
    +   *
    +   * The notification callbacks in this interface are called outside window boundaries so the operators should not
    +   * attempt to send any tuples in these callbacks.
    +   *
    +   */
    +  interface CheckpointNotificationListener extends CheckpointListener
    +  {
    +    /**
    +     * Notify the operator that a checkpoint is about to be performed.
    +     *
    +     * Operators may need to perform certain tasks just before a checkpoint such as calling flush on a stream. Having
    +     * this notification helps operators perform such operations optimally by doing them once before checkpoint as
    +     * opposed to doing it repeatedly at the end of every window.
    +     *
    +     * The method will be called before the checkpoint is performed. It will be called after
    +     * {@link Operator#endWindow()} call of the window preceding the checkpoint and before the checkpoint is
    +     * actually performed.
    +     *
    +     * @param windowId The window id of the window preceding the checkpoint
    +     */
    +    void prepareCheckpoint(long windowId);
    --- End diff --
    
    @PramodSSImmaneni : The name suggests that every Stateless Operator needs to implement this which is not true. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by chandnisingh <gi...@git.apache.org>.
Github user chandnisingh commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r47979384
  
    --- Diff: api/src/main/java/com/datatorrent/api/Operator.java ---
    @@ -270,4 +271,29 @@ public String toString()
     
       }
     
    +  /**
    +   * Operators that need to be notified about checkpoint events should implement this interface.
    +   *
    +   * The notification callbacks in this interface are called outside window boundaries so the operators should not
    +   * attempt to send any tuples in these callbacks.
    +   *
    +   */
    +  public static interface CheckpointNotificationListener extends CheckpointListener
    +  {
    +    /**
    +     * Notify the operator that a checkpoint is about to be performed.
    +     *
    +     * Operators may need to perform certain tasks just before a checkpoint such as calling flush on a stream. Having
    +     * this notification helps operators perform such operations optimally by doing them once before checkpoint as
    +     * opposed to doing it repeatedly at the end of every window.
    +     *
    +     * The method will be called before the checkpoint is performed. It will be called after
    +     * {@link Operator#endWindow()} call of the window preceding the checkpoint and before the checkpoint is
    +     * actually performed.
    +     *
    +     * @param windowId The window id of the window preceding the checkpoint
    +     */
    +    public void checkpoint(long windowId);
    --- End diff --
    
    @PramodSSImmaneni I think window id is important here for consistency with Checkpointed callback as well.  With checkpointing an operator's state is saved with respect to window id. The operator is informed before checkpointing state in reference to a window id and when it is done, that checkpointed callback is triggered. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by chinmaykolhatkar <gi...@git.apache.org>.
Github user chinmaykolhatkar commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r48694586
  
    --- Diff: engine/src/main/java/com/datatorrent/stram/engine/Node.java ---
    @@ -485,6 +485,10 @@ protected void deactivateSinks()
     
       void checkpoint(long windowId)
       {
    +    if (operator instanceof Operator.CheckpointNotificationListener) {
    +      ((Operator.CheckpointNotificationListener)operator).beforeCheckpoint(windowId);
    --- End diff --
    
    Should this be caught for Throwable?
    In case the beforeCheckpoint throws exception, the states which are meant to be saved won't be.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by gauravgopi123 <gi...@git.apache.org>.
Github user gauravgopi123 commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r48933859
  
    --- Diff: api/src/main/java/com/datatorrent/api/Operator.java ---
    @@ -224,6 +224,7 @@ public String toString()
        * Operators must implement this interface if they are interested in being notified as
        * soon as the operator state is checkpointed or committed.
        *
    +   * @deprecated Use {@link CheckpointNotificationListener} instead
    --- End diff --
    
    Why is CheckpointListener deprecated???


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by PramodSSImmaneni <gi...@git.apache.org>.
Github user PramodSSImmaneni commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r47976721
  
    --- Diff: api/src/main/java/com/datatorrent/api/Operator.java ---
    @@ -270,4 +271,29 @@ public String toString()
     
       }
     
    +  /**
    +   * Operators that need to be notified about checkpoint events should implement this interface.
    +   *
    +   * The notification callbacks in this interface are called outside window boundaries so the operators should not
    +   * attempt to send any tuples in these callbacks.
    +   *
    +   */
    +  public static interface CheckpointNotificationListener extends CheckpointListener
    +  {
    +    /**
    +     * Notify the operator that a checkpoint is about to be performed.
    +     *
    +     * Operators may need to perform certain tasks just before a checkpoint such as calling flush on a stream. Having
    +     * this notification helps operators perform such operations optimally by doing them once before checkpoint as
    +     * opposed to doing it repeatedly at the end of every window.
    +     *
    +     * The method will be called before the checkpoint is performed. It will be called after
    +     * {@link Operator#endWindow()} call of the window preceding the checkpoint and before the checkpoint is
    +     * actually performed.
    +     *
    +     * @param windowId The window id of the window preceding the checkpoint
    +     */
    +    public void checkpoint(long windowId);
    --- End diff --
    
    @chandnisingh good point. However can you think of a use case where you would need to identify more than one set of state slices between checkpoints distinctly at the same time. If that is not the case and the old slices are cleaned up then we can reuse the same windowId from beginWindow to identify all of them.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by PramodSSImmaneni <gi...@git.apache.org>.
Github user PramodSSImmaneni commented on the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#issuecomment-166001035
  
    Changed method name but kept the windowId. Look at use case brought up by @chandnisingh 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by PramodSSImmaneni <gi...@git.apache.org>.
Github user PramodSSImmaneni commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r48074590
  
    --- Diff: api/src/main/java/com/datatorrent/api/Operator.java ---
    @@ -270,4 +271,29 @@ public String toString()
     
       }
     
    +  /**
    +   * Operators that need to be notified about checkpoint events should implement this interface.
    +   *
    +   * The notification callbacks in this interface are called outside window boundaries so the operators should not
    +   * attempt to send any tuples in these callbacks.
    +   *
    +   */
    +  public static interface CheckpointNotificationListener extends CheckpointListener
    +  {
    +    /**
    +     * Notify the operator that a checkpoint is about to be performed.
    +     *
    +     * Operators may need to perform certain tasks just before a checkpoint such as calling flush on a stream. Having
    +     * this notification helps operators perform such operations optimally by doing them once before checkpoint as
    +     * opposed to doing it repeatedly at the end of every window.
    +     *
    +     * The method will be called before the checkpoint is performed. It will be called after
    +     * {@link Operator#endWindow()} call of the window preceding the checkpoint and before the checkpoint is
    +     * actually performed.
    +     *
    +     * @param windowId The window id of the window preceding the checkpoint
    +     */
    +    public void checkpoint(long windowId);
    --- End diff --
    
    @chandnisingh Your example of application window > streaming window along with incremental checkpointing is a very good example to have it in the argument.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by tweise <gi...@git.apache.org>.
Github user tweise commented on the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#issuecomment-169249166
  
    Please squash once remaining comments are addressed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by PramodSSImmaneni <gi...@git.apache.org>.
Github user PramodSSImmaneni commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r49036338
  
    --- Diff: engine/src/main/java/com/datatorrent/stram/engine/Node.java ---
    @@ -485,6 +485,10 @@ protected void deactivateSinks()
     
       void checkpoint(long windowId)
       {
    +    if (operator instanceof Operator.CheckpointNotificationListener) {
    --- End diff --
    
    I see your point Thomas but as chinmay pointed out shouldn't both this and checkpointed callback not be called for stateless. Gaurav if you use this call in stateless operator to persist your operator state in your own way there is no inverse callback from the platform to recover that state.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by chandnisingh <gi...@git.apache.org>.
Github user chandnisingh commented on the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#issuecomment-169805524
  
    Just stumbled upon this so wanted to share:
    https://github.com/siom79/japicmp/issues/59
    
    Adding a method to an interface breaks source compatibility but not binary compatibility.
    
    This is just for information. I don't have any issues with this pull request.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by gauravgopi123 <gi...@git.apache.org>.
Github user gauravgopi123 commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r47708182
  
    --- Diff: api/src/main/java/com/datatorrent/api/Operator.java ---
    @@ -270,4 +271,29 @@ public String toString()
     
       }
     
    +  /**
    +   * Operators that need to be notified about checkpoint events should implement this interface.
    +   *
    +   * The notification callbacks in this interface are called outside window boundaries so the operators should not
    +   * attempt to send any tuples in these callbacks.
    +   *
    +   */
    +  public static interface CheckpointNotificationListener extends CheckpointListener
    +  {
    +    /**
    +     * Notify the operator that a checkpoint is about to be performed.
    +     *
    +     * Operators may need to perform certain tasks just before a checkpoint such as calling flush on a stream. Having
    +     * this notification helps operators perform such operations optimally by doing them once before checkpoint as
    +     * opposed to doing it repeatedly at the end of every window.
    +     *
    +     * The method will be called before the checkpoint is performed. It will be called after
    +     * {@link Operator#endWindow()} call of the window preceding the checkpoint and before the checkpoint is
    +     * actually performed.
    +     *
    +     * @param windowId The window id of the window preceding the checkpoint
    +     */
    +    public void checkpoint(long windowId);
    --- End diff --
    
    @PramodSSImmaneni : Why do you need windowId? Is it not getting called in this order beginwindow(x)->endWindow()->checkpoint()?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by chandnisingh <gi...@git.apache.org>.
Github user chandnisingh commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r47982690
  
    --- Diff: api/src/main/java/com/datatorrent/api/Operator.java ---
    @@ -270,4 +271,29 @@ public String toString()
     
       }
     
    +  /**
    +   * Operators that need to be notified about checkpoint events should implement this interface.
    +   *
    +   * The notification callbacks in this interface are called outside window boundaries so the operators should not
    +   * attempt to send any tuples in these callbacks.
    +   *
    +   */
    +  public static interface CheckpointNotificationListener extends CheckpointListener
    +  {
    +    /**
    +     * Notify the operator that a checkpoint is about to be performed.
    +     *
    +     * Operators may need to perform certain tasks just before a checkpoint such as calling flush on a stream. Having
    +     * this notification helps operators perform such operations optimally by doing them once before checkpoint as
    +     * opposed to doing it repeatedly at the end of every window.
    +     *
    +     * The method will be called before the checkpoint is performed. It will be called after
    +     * {@link Operator#endWindow()} call of the window preceding the checkpoint and before the checkpoint is
    +     * actually performed.
    +     *
    +     * @param windowId The window id of the window preceding the checkpoint
    +     */
    +    public void checkpoint(long windowId);
    --- End diff --
    
    IMO it is needed with checkpoint as well. It signifies that state is going to be persisted with respect to a particular window Id. 
    As mentioned  before this window id is not the same as what the operator caches in beginWindow(...) when checkpointing is happen between application windows.
    One use case is checkpoint being done incrementally.
    It is consistent with checkpointed(windowId) callback.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by tweise <gi...@git.apache.org>.
Github user tweise commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r48931445
  
    --- Diff: engine/src/main/java/com/datatorrent/stram/engine/Node.java ---
    @@ -485,6 +485,10 @@ protected void deactivateSinks()
     
       void checkpoint(long windowId)
       {
    +    if (operator instanceof Operator.CheckpointNotificationListener) {
    --- End diff --
    
    This should move into the if block so it does not execute for stateless operator.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by PramodSSImmaneni <gi...@git.apache.org>.
Github user PramodSSImmaneni commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r49036597
  
    --- Diff: engine/src/main/java/com/datatorrent/stram/engine/Node.java ---
    @@ -485,6 +485,10 @@ protected void deactivateSinks()
     
       void checkpoint(long windowId)
       {
    +    if (operator instanceof Operator.CheckpointNotificationListener) {
    +      ((Operator.CheckpointNotificationListener)operator).beforeCheckpoint(windowId);
    --- End diff --
    
    The checkpoint call is called from within the Node lifecycle run method which has a global handling for Throwable.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by ilooner <gi...@git.apache.org>.
Github user ilooner commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r49119498
  
    --- Diff: engine/src/main/java/com/datatorrent/stram/engine/Node.java ---
    @@ -485,6 +485,10 @@ protected void deactivateSinks()
     
       void checkpoint(long windowId)
       {
    +    if (operator instanceof Operator.CheckpointNotificationListener) {
    --- End diff --
    
    +1 for Pramod's and Chandni's point. If you need custom checkpointing you use a different storage agent just like people have done with in memory checkpointing. If it is not easy or fast to use a Storage agent to do this then we should think about how to improve the Storage API, not introduce a hacky way of checkpointing. If an operator is Stateless that means it has no state.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by tweise <gi...@git.apache.org>.
Github user tweise commented on the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#issuecomment-170126315
  
    The issue is resolved. This would make for a good standalone discussion on dev@


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by tweise <gi...@git.apache.org>.
Github user tweise commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r47975611
  
    --- Diff: api/src/main/java/com/datatorrent/api/Operator.java ---
    @@ -270,4 +271,29 @@ public String toString()
     
       }
     
    +  /**
    +   * Operators that need to be notified about checkpoint events should implement this interface.
    +   *
    +   * The notification callbacks in this interface are called outside window boundaries so the operators should not
    +   * attempt to send any tuples in these callbacks.
    +   *
    +   */
    +  public static interface CheckpointNotificationListener extends CheckpointListener
    +  {
    +    /**
    +     * Notify the operator that a checkpoint is about to be performed.
    +     *
    +     * Operators may need to perform certain tasks just before a checkpoint such as calling flush on a stream. Having
    +     * this notification helps operators perform such operations optimally by doing them once before checkpoint as
    +     * opposed to doing it repeatedly at the end of every window.
    +     *
    +     * The method will be called before the checkpoint is performed. It will be called after
    +     * {@link Operator#endWindow()} call of the window preceding the checkpoint and before the checkpoint is
    +     * actually performed.
    +     *
    +     * @param windowId The window id of the window preceding the checkpoint
    +     */
    +    public void checkpoint(long windowId);
    --- End diff --
    
    There will be cases where it is needed to slice the state by checkpoint for processing in committed, but then this is the same situation as with endWindow and we should keep it consistent.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by PramodSSImmaneni <gi...@git.apache.org>.
Github user PramodSSImmaneni commented on the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#issuecomment-163946636
  
    @tweise please see


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by chandnisingh <gi...@git.apache.org>.
Github user chandnisingh commented on the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#issuecomment-163998455
  
    Much needed change :-)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by chandnisingh <gi...@git.apache.org>.
Github user chandnisingh commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r47975529
  
    --- Diff: api/src/main/java/com/datatorrent/api/Operator.java ---
    @@ -270,4 +271,29 @@ public String toString()
     
       }
     
    +  /**
    +   * Operators that need to be notified about checkpoint events should implement this interface.
    +   *
    +   * The notification callbacks in this interface are called outside window boundaries so the operators should not
    +   * attempt to send any tuples in these callbacks.
    +   *
    +   */
    +  public static interface CheckpointNotificationListener extends CheckpointListener
    +  {
    +    /**
    +     * Notify the operator that a checkpoint is about to be performed.
    +     *
    +     * Operators may need to perform certain tasks just before a checkpoint such as calling flush on a stream. Having
    +     * this notification helps operators perform such operations optimally by doing them once before checkpoint as
    +     * opposed to doing it repeatedly at the end of every window.
    +     *
    +     * The method will be called before the checkpoint is performed. It will be called after
    +     * {@link Operator#endWindow()} call of the window preceding the checkpoint and before the checkpoint is
    +     * actually performed.
    +     *
    +     * @param windowId The window id of the window preceding the checkpoint
    +     */
    +    public void checkpoint(long windowId);
    --- End diff --
    
    @PramodSSImmaneni When checkpointing is done between an application window (application window > 1), then beginWindow is not called for every streaming window. So don't we need windowId here?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by chandnisingh <gi...@git.apache.org>.
Github user chandnisingh commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r47977129
  
    --- Diff: api/src/main/java/com/datatorrent/api/Operator.java ---
    @@ -270,4 +271,29 @@ public String toString()
     
       }
     
    +  /**
    +   * Operators that need to be notified about checkpoint events should implement this interface.
    +   *
    +   * The notification callbacks in this interface are called outside window boundaries so the operators should not
    +   * attempt to send any tuples in these callbacks.
    +   *
    +   */
    +  public static interface CheckpointNotificationListener extends CheckpointListener
    +  {
    +    /**
    +     * Notify the operator that a checkpoint is about to be performed.
    +     *
    +     * Operators may need to perform certain tasks just before a checkpoint such as calling flush on a stream. Having
    +     * this notification helps operators perform such operations optimally by doing them once before checkpoint as
    +     * opposed to doing it repeatedly at the end of every window.
    +     *
    +     * The method will be called before the checkpoint is performed. It will be called after
    +     * {@link Operator#endWindow()} call of the window preceding the checkpoint and before the checkpoint is
    +     * actually performed.
    +     *
    +     * @param windowId The window id of the window preceding the checkpoint
    +     */
    +    public void checkpoint(long windowId);
    --- End diff --
    
    @PramodSSImmaneni Incremental checkpointing is an example



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by tweise <gi...@git.apache.org>.
Github user tweise commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r47981967
  
    --- Diff: api/src/main/java/com/datatorrent/api/Operator.java ---
    @@ -270,4 +271,29 @@ public String toString()
     
       }
     
    +  /**
    +   * Operators that need to be notified about checkpoint events should implement this interface.
    +   *
    +   * The notification callbacks in this interface are called outside window boundaries so the operators should not
    +   * attempt to send any tuples in these callbacks.
    +   *
    +   */
    +  public static interface CheckpointNotificationListener extends CheckpointListener
    +  {
    +    /**
    +     * Notify the operator that a checkpoint is about to be performed.
    +     *
    +     * Operators may need to perform certain tasks just before a checkpoint such as calling flush on a stream. Having
    +     * this notification helps operators perform such operations optimally by doing them once before checkpoint as
    +     * opposed to doing it repeatedly at the end of every window.
    +     *
    +     * The method will be called before the checkpoint is performed. It will be called after
    +     * {@link Operator#endWindow()} call of the window preceding the checkpoint and before the checkpoint is
    +     * actually performed.
    +     *
    +     * @param windowId The window id of the window preceding the checkpoint
    +     */
    +    public void checkpoint(long windowId);
    --- End diff --
    
    checkpointed is not guaranteed to be called before the next window starts. That's why the windowId is passed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by gauravgopi123 <gi...@git.apache.org>.
Github user gauravgopi123 commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r48931757
  
    --- Diff: engine/src/main/java/com/datatorrent/stram/engine/Node.java ---
    @@ -485,6 +485,10 @@ protected void deactivateSinks()
     
       void checkpoint(long windowId)
       {
    +    if (operator instanceof Operator.CheckpointNotificationListener) {
    --- End diff --
    
    I think this call should be outside that loop. This call is mainly to manage your transient variables just before check pointing which stateless operator will have too


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by PramodSSImmaneni <gi...@git.apache.org>.
Github user PramodSSImmaneni commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r47975319
  
    --- Diff: api/src/main/java/com/datatorrent/api/Operator.java ---
    @@ -270,4 +271,29 @@ public String toString()
     
       }
     
    +  /**
    +   * Operators that need to be notified about checkpoint events should implement this interface.
    +   *
    +   * The notification callbacks in this interface are called outside window boundaries so the operators should not
    +   * attempt to send any tuples in these callbacks.
    +   *
    +   */
    +  public static interface CheckpointNotificationListener extends CheckpointListener
    +  {
    +    /**
    +     * Notify the operator that a checkpoint is about to be performed.
    +     *
    +     * Operators may need to perform certain tasks just before a checkpoint such as calling flush on a stream. Having
    +     * this notification helps operators perform such operations optimally by doing them once before checkpoint as
    +     * opposed to doing it repeatedly at the end of every window.
    +     *
    +     * The method will be called before the checkpoint is performed. It will be called after
    +     * {@link Operator#endWindow()} call of the window preceding the checkpoint and before the checkpoint is
    +     * actually performed.
    +     *
    +     * @param windowId The window id of the window preceding the checkpoint
    +     */
    +    public void checkpoint(long windowId);
    --- End diff --
    
    The reason I was passing the window id to the checkpoint call was for convenience if the operator needs that information for any reason (of course you can save it in beginWindow as well). Can anyone think of a use case/scenario where the window id might be needed in this call.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by tweise <gi...@git.apache.org>.
Github user tweise commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r49135114
  
    --- Diff: engine/src/main/java/com/datatorrent/stram/engine/Node.java ---
    @@ -485,6 +485,10 @@ protected void deactivateSinks()
     
       void checkpoint(long windowId)
       {
    +    if (operator instanceof Operator.CheckpointNotificationListener) {
    --- End diff --
    
    @PramodSSImmaneni looks like majority is in favor of making the change. Can you please squash the commits along with it?  


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by tweise <gi...@git.apache.org>.
Github user tweise commented on the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#issuecomment-169862940
  
    Looks good, can you please also do a rebase:
    ```
    git rebase apache/devel-3
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by chandnisingh <gi...@git.apache.org>.
Github user chandnisingh commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r47977582
  
    --- Diff: api/src/main/java/com/datatorrent/api/Operator.java ---
    @@ -270,4 +271,29 @@ public String toString()
     
       }
     
    +  /**
    +   * Operators that need to be notified about checkpoint events should implement this interface.
    +   *
    +   * The notification callbacks in this interface are called outside window boundaries so the operators should not
    +   * attempt to send any tuples in these callbacks.
    +   *
    +   */
    +  public static interface CheckpointNotificationListener extends CheckpointListener
    +  {
    +    /**
    +     * Notify the operator that a checkpoint is about to be performed.
    +     *
    +     * Operators may need to perform certain tasks just before a checkpoint such as calling flush on a stream. Having
    +     * this notification helps operators perform such operations optimally by doing them once before checkpoint as
    +     * opposed to doing it repeatedly at the end of every window.
    +     *
    +     * The method will be called before the checkpoint is performed. It will be called after
    +     * {@link Operator#endWindow()} call of the window preceding the checkpoint and before the checkpoint is
    +     * actually performed.
    +     *
    +     * @param windowId The window id of the window preceding the checkpoint
    +     */
    +    public void checkpoint(long windowId);
    --- End diff --
    
    @PramodSSImmaneni Sorry couldn't complete the comment. Incremental checkpointing may require windowId  because it is the operator which is writing the difference in state corresponding to a window id. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by PramodSSImmaneni <gi...@git.apache.org>.
Github user PramodSSImmaneni commented on the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#issuecomment-168287768
  
    Updated name


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by gauravgopi123 <gi...@git.apache.org>.
Github user gauravgopi123 commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r49016695
  
    --- Diff: engine/src/main/java/com/datatorrent/stram/engine/Node.java ---
    @@ -485,6 +485,10 @@ protected void deactivateSinks()
     
       void checkpoint(long windowId)
       {
    +    if (operator instanceof Operator.CheckpointNotificationListener) {
    --- End diff --
    
    I think this call should be made irrespective of whether the operator is stateless or not. Here is the case I am thinking of, If I want to serialize the operator state in my own way I should be able to do in call. I can mark all my properties transient and use this call to serialize my state. I can do using storage agent too but this could be easier and faster way.. 
    
    Just a thought


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-apex-core/pull/187


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by tweise <gi...@git.apache.org>.
Github user tweise commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r49022209
  
    --- Diff: engine/src/main/java/com/datatorrent/stram/engine/Node.java ---
    @@ -485,6 +485,10 @@ protected void deactivateSinks()
     
       void checkpoint(long windowId)
       {
    +    if (operator instanceof Operator.CheckpointNotificationListener) {
    --- End diff --
    
    Good point. Then we should perhaps make this explicit in the java doc?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by chandnisingh <gi...@git.apache.org>.
Github user chandnisingh commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r49117385
  
    --- Diff: engine/src/main/java/com/datatorrent/stram/engine/Node.java ---
    @@ -485,6 +485,10 @@ protected void deactivateSinks()
     
       void checkpoint(long windowId)
       {
    +    if (operator instanceof Operator.CheckpointNotificationListener) {
    --- End diff --
    
    I agree with @PramodSSImmaneni 
    By definition a Stateless operator is an operator which doesn't have a state to persist which is irrespective of whether the operator developer wants the out-of-box state persistence by platform or use any other custom way of persistence. But if the operator developer marks it @Stateless that indicates that there isn't any state. 
    Being Stateless doesn't mean that operator will handle persisting state in a custom way. For that we have a pluggable storage agent API.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by tweise <gi...@git.apache.org>.
Github user tweise commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r48981018
  
    --- Diff: api/src/main/java/com/datatorrent/api/Operator.java ---
    @@ -224,6 +224,7 @@ public String toString()
        * Operators must implement this interface if they are interested in being notified as
        * soon as the operator state is checkpointed or committed.
        *
    +   * @deprecated Use {@link CheckpointNotificationListener} instead
    --- End diff --
    
    We added an additional method and had to introduce the new interface instead of modifying the existing one for backward compatibility.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by gauravgopi123 <gi...@git.apache.org>.
Github user gauravgopi123 commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r49039537
  
    --- Diff: engine/src/main/java/com/datatorrent/stram/engine/Node.java ---
    @@ -485,6 +485,10 @@ protected void deactivateSinks()
     
       void checkpoint(long windowId)
       {
    +    if (operator instanceof Operator.CheckpointNotificationListener) {
    --- End diff --
    
    It is the operator developers responsibility in that case to restore the state


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by gauravgopi123 <gi...@git.apache.org>.
Github user gauravgopi123 commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r47708784
  
    --- Diff: api/src/main/java/com/datatorrent/api/Operator.java ---
    @@ -270,4 +271,29 @@ public String toString()
     
       }
     
    +  /**
    +   * Operators that need to be notified about checkpoint events should implement this interface.
    +   *
    +   * The notification callbacks in this interface are called outside window boundaries so the operators should not
    +   * attempt to send any tuples in these callbacks.
    +   *
    +   */
    +  public static interface CheckpointNotificationListener extends CheckpointListener
    +  {
    +    /**
    +     * Notify the operator that a checkpoint is about to be performed.
    +     *
    +     * Operators may need to perform certain tasks just before a checkpoint such as calling flush on a stream. Having
    +     * this notification helps operators perform such operations optimally by doing them once before checkpoint as
    +     * opposed to doing it repeatedly at the end of every window.
    +     *
    +     * The method will be called before the checkpoint is performed. It will be called after
    +     * {@link Operator#endWindow()} call of the window preceding the checkpoint and before the checkpoint is
    +     * actually performed.
    +     *
    +     * @param windowId The window id of the window preceding the checkpoint
    +     */
    +    public void checkpoint(long windowId);
    --- End diff --
    
    Should it be called `preCheckpoint()`? as this is called before actual checkpoint happens?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-core pull request: Checkpoint notification to notif...

Posted by tweise <gi...@git.apache.org>.
Github user tweise commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-core/pull/187#discussion_r48981189
  
    --- Diff: engine/src/main/java/com/datatorrent/stram/engine/Node.java ---
    @@ -485,6 +485,10 @@ protected void deactivateSinks()
     
       void checkpoint(long windowId)
       {
    +    if (operator instanceof Operator.CheckpointNotificationListener) {
    --- End diff --
    
    The call is to notify of the imminent checkpoint. If there is no checkpoint, then why call the listener. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---