You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@zeppelin.apache.org by namanmishra91 <gi...@git.apache.org> on 2017/10/11 05:58:20 UTC

[GitHub] zeppelin pull request #2619: Run all paragraphs sequentially

GitHub user namanmishra91 opened a pull request:

    https://github.com/apache/zeppelin/pull/2619

    Run all paragraphs sequentially

    ### What is this PR for?
    This PR introduces "run all" behaviour for a note in a sequential manner. The "Run All paragraphs" button will now run the notebook sequentially. There is a UX change as part of this PR, which is discussed below. Any suggestions to make it better are welcome.
    Following is the approach taken in brief:
    - Replace the behaviour of "Run All paragraphs" to "Run All sequentially"
    - Submit paragraphs to the respective scheduler only after the previous one completes execution (via wait-notify mechanism)
    - Communicate the status of the sequential run to client over websocket
    - Since the paragraphs won't be submitted immediately, changing their status is a challenge while keeping the code maintaining it at one place. Therefore, the approach I have taken is to keep the status of the paragrahs as-is and disallow explicit run of individual paragraphs when sequential run is in progress (see GIF for details).
    
    ### What type of PR is it?
    Improvement
    
    ### Todos
    
    ### What is the Jira issue?
    https://issues.apache.org/jira/browse/ZEPPELIN-2368
    
    ### How should this be tested?
    * Have a notebook with different interpreter types (e.g. sh, md, spark etc.)
    * Click "Run All paragraphs"
    * The paragraphs should run sequentially
    * While sequential run is in progress, explicit run of paragraphs should not be allowed
    * Correct sequential run state should be communicated to the browser when note is loaded
    
    ### Screenshots (if appropriate)
    ![sequential_run_all_1](https://user-images.githubusercontent.com/6438072/31424317-262d237a-ae77-11e7-81fe-8eda0d14c01c.gif)
    
    ![sequential_run_all_2](https://user-images.githubusercontent.com/6438072/31424322-29b4a23e-ae77-11e7-80dc-b6b9d84d2624.gif)
    
    
    ### Questions:
    * Does the licenses files need update? No
    * Is there breaking changes for older versions? Probably not, but the UX would be different
    * Does this needs documentation? Yes


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/namanmishra91/zeppelin ZEPPELIN-2368

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/zeppelin/pull/2619.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2619
    
----
commit 4a6f9d19b5ba846bf8f205ec57a1649f5416a0ce
Author: Naman Mishra <na...@gmail.com>
Date:   2017-07-28T11:07:57Z

    Backend changes for sequential run of paragraphs

commit 5fc5c8390bf6c43009746b415334327ae2382c43
Author: Naman Mishra <na...@gmail.com>
Date:   2017-10-10T09:50:28Z

    Backend + UI changes

commit fdd1baeb10cc856ad3928cb7a35116172c14ea84
Author: Naman Mishra <na...@gmail.com>
Date:   2017-07-28T11:07:57Z

    Backend changes for sequential run of paragraphs

commit 8ef73408ad4addee1979ffc52fe3bdbd2fc93bd6
Author: Naman Mishra <na...@gmail.com>
Date:   2017-10-10T09:50:28Z

    Backend + UI changes

commit a51a3e8fd76d9293f8479e085b002352e2f3ac99
Author: Naman Mishra <na...@gmail.com>
Date:   2017-10-10T10:01:00Z

    Merge branch 'ZEPPELIN-2368' of https://github.com/namanmishra91/zeppelin into ZEPPELIN-2368

commit 63e7e521ca82d259278e40c88764df67b1c870ff
Author: Naman Mishra <na...@gmail.com>
Date:   2017-10-10T10:15:25Z

    Clean up logs

----


---

[GitHub] zeppelin issue #2619: Run all paragraphs sequentially

Posted by zjffdu <gi...@git.apache.org>.
Github user zjffdu commented on the issue:

    https://github.com/apache/zeppelin/pull/2619
  
    @namanmishra91 I don't understand why it would invoke frontend. Backend know more context about paragraph status and backend could control the workflow of paragraph running easily. 


---

[GitHub] zeppelin issue #2619: Run all paragraphs sequentially

Posted by namanmishra91 <gi...@git.apache.org>.
Github user namanmishra91 commented on the issue:

    https://github.com/apache/zeppelin/pull/2619
  
    @zjffdu Yeah, I can change the implementation to make `persistAndExecuteSingleParagraph` paragraph blocking. Then we won't need UI side changes for separate event handling etc. However, we will still need other UI changes for paragraph state management because the execution workflow is changing. Just to make sure that we are on the same page, let me take a moment to explain why those will be needed:
    
    Unless we want to change the zeppelin architecture of separate interpreter queues, we need to hold paragraphs to prevent them from being executed. Currently all paragraphs get submitted immediately and their status changes to PENDING. With the above approach, only one paragraph will be in any interpreter's queue at any given point of time; hence the status of _yet-to-be executed_ paragraphs will not get updated and the communication to the user that those paragraphs will run eventually will be lost. This will be a major change in the UX. We need to discuss and think about what should be the best way to handle this but as a preliminary implementation, we need to prevent explicit runs of the paragraphs that are not running yet. This will require maintaining state about whether a _run all_ is in progress and passing this to front-end.
    
    Let me know your thoughts on the above.


---

[GitHub] zeppelin issue #2619: Run all paragraphs sequentially

Posted by zjffdu <gi...@git.apache.org>.
Github user zjffdu commented on the issue:

    https://github.com/apache/zeppelin/pull/2619
  
    @namanmishra91 I made another PR #2627  which only change the backend code. Let me know whether this meet your requirement. 


---

[GitHub] zeppelin issue #2619: Run all paragraphs sequentially

Posted by namanmishra91 <gi...@git.apache.org>.
Github user namanmishra91 commented on the issue:

    https://github.com/apache/zeppelin/pull/2619
  
    Closed as PR 2627 addresses the same thing


---

[GitHub] zeppelin issue #2619: Run all paragraphs sequentially

Posted by tinkoff-dwh <gi...@git.apache.org>.
Github user tinkoff-dwh commented on the issue:

    https://github.com/apache/zeppelin/pull/2619
  
    https://github.com/apache/zeppelin/pull/2627 merged
    
    maybe close this PR?


---

[GitHub] zeppelin issue #2619: Run all paragraphs sequentially

Posted by namanmishra91 <gi...@git.apache.org>.
Github user namanmishra91 commented on the issue:

    https://github.com/apache/zeppelin/pull/2619
  
    @zjffdu 
    
    > What we need to do it to run it blocking way.
    
    The  code does exactly that. It might look complicated overall as a lot of code is related to handling and passing of sequential run state to UI. For backend changes, please check `runAllParagraphsSequentially()` in `NotebookServer.java`. 
    
    In a gist, sequential run is ultimately performed by `SequentialNoteRunner.java` which basically runs a paragraph, blocks until it is complete and then submits the next one. If it helps, I can divide this into 2 PRs : one for backend only changes and another for integration with frontend.


---

[GitHub] zeppelin pull request #2619: Run all paragraphs sequentially

Posted by namanmishra91 <gi...@git.apache.org>.
Github user namanmishra91 closed the pull request at:

    https://github.com/apache/zeppelin/pull/2619


---

[GitHub] zeppelin issue #2619: Run all paragraphs sequentially

Posted by zjffdu <gi...@git.apache.org>.
Github user zjffdu commented on the issue:

    https://github.com/apache/zeppelin/pull/2619
  
    @namanmishra91  I don't mean your PR doesn't do it correctly. I mean you could do it in an easier approach with less code changes. Just modifying `persistAndExecuteSingleParagraph` to make it run paragraph in a blocking way should be sufficient.  


---

[GitHub] zeppelin issue #2619: Run all paragraphs sequentially

Posted by zjffdu <gi...@git.apache.org>.
Github user zjffdu commented on the issue:

    https://github.com/apache/zeppelin/pull/2619
  
    @namanmishra91 Thanks for the contribution. But the implementation is a little complicated to me. I think the easiest implementation is just run paragraphs sequentially in backend. Look at the following code.
    https://github.com/apache/zeppelin/blob/master/zeppelin-server/src/main/java/org/apache/zeppelin/socket/NotebookServer.java#L1681
    `persistAndExecuteSingleParagraph` run paragraph in non-blocking way. What we need to do it to run it blocking way. 


---

[GitHub] zeppelin issue #2619: Run all paragraphs sequentially

Posted by zjffdu <gi...@git.apache.org>.
Github user zjffdu commented on the issue:

    https://github.com/apache/zeppelin/pull/2619
  
    @namanmishra91 I don't see the current behavior is retained in this PR. Personally I don't think the current behavior needs to be retained.  Even we want to keep the current behavior, it should be easy to make `persistAndExecuteSingleParagraph` to be both blocking and non-blocking (Just add one flag). I don't think it is necessary to invoke any frontend change, backend change should be sufficient. 


---

[GitHub] zeppelin issue #2619: Run all paragraphs sequentially

Posted by zjffdu <gi...@git.apache.org>.
Github user zjffdu commented on the issue:

    https://github.com/apache/zeppelin/pull/2619
  
    ping @namanmishra91 


---

[GitHub] zeppelin issue #2619: Run all paragraphs sequentially

Posted by namanmishra91 <gi...@git.apache.org>.
Github user namanmishra91 commented on the issue:

    https://github.com/apache/zeppelin/pull/2619
  
    @zjffdu Thanks for the feedback. Directly modifying the `persistAndExecuteSingleParagraph` method will lead to the current behaviour getting lost completely. The reason I chose to not overwrite the existing code flow was that there were suggestions on the JIRA that we should retain the current (concurrent) behaviour as well. Hence, added a new code path in backend as well as UI.


---