You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@zeppelin.apache.org by GitBox <gi...@apache.org> on 2021/10/12 09:35:32 UTC

[GitHub] [zeppelin] Reamer opened a new pull request #4252: [ZEPPELIN-5559] Add note cache

Reamer opened a new pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252


   ### What is this PR for?
   This pull request adds a note cache that holds only a certain number of notes in memory.
   
   To keep the cache clean and consistent across multiple concurrent save actions, I rewrote the mechanism for saving notes.
   
   ### What type of PR is it?
   - Bug Fix
   
   ### What is the Jira issue?
   * https://issues.apache.org/jira/browse/ZEPPELIN-5559
   
   ### How should this be tested?
   * CI
   
   ### Questions:
   * Does the licenses files need update? No
   * Is there breaking changes for older versions? No
   * Does this needs documentation? Yes, a new configuration option has been added
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] zjffdu commented on a change in pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
zjffdu commented on a change in pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#discussion_r780673496



##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/NoteManager.java
##########
@@ -561,69 +569,79 @@ public String toString() {
    * This class has 2 usage scenarios:
    * 1. metadata of note (only noteId and note name is loaded via reading the file name)
    * 2. the note object (note content is loaded from NotebookRepo)
-   *
+   * <br>
    * It will load note from NotebookRepo lazily until method getNote is called.
+   * A NoteCache ensures to free up resources, because its size is limited.
    */
   public static class NoteNode {
 
     private Folder parent;
-    private Note note;
+    private NoteInfo noteinfo;

Review comment:
       noteinfo -> noteInfo ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] Reamer commented on a change in pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
Reamer commented on a change in pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#discussion_r751570244



##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/NoteManager.java
##########
@@ -651,5 +667,63 @@ public void updateNotePath() {
     }
   }
 
+  /**
+   * Unloads notes when reasonable amount of notes are in loaded state.
+   * Leverage a simple LRU cache for determing evictable notes. Ensure to
+   * not evict notes during save operation (dirty).

Review comment:
       I have changed the implementation to a read/write lock. Notes that have this read or write lock are not evicted.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] Reamer commented on pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
Reamer commented on pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#issuecomment-958761898


   @zjffdu 
   I have had very good experience with this feature in my environment and would like to include it in the master so that further development can be based on it.
   Since I have been using this feature, Zeppelin does not run in any OOM when I look at all the notes via script.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] zjffdu commented on a change in pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
zjffdu commented on a change in pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#discussion_r738927892



##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/NoteManager.java
##########
@@ -651,5 +667,63 @@ public void updateNotePath() {
     }
   }
 
+  /**
+   * Unloads notes when reasonable amount of notes are in loaded state.
+   * Leverage a simple LRU cache for determing evictable notes. Ensure to
+   * not evict notes during save operation (dirty).

Review comment:
       Is it enough only not evict notes during save(write) operation ? The eviction may also affect read operation I am afraid (such as broadcast note to frontend). Here's one ticket which is caused by concurrent read/write operation. https://issues.apache.org/jira/browse/ZEPPELIN-5574




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] Reamer commented on pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
Reamer commented on pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#issuecomment-971893898


   @zjffdu 
   I have changed the implementation to Read/Write-Lock and the CI is also successful. A new review would be nice.
   Sorry for this big change, but protecting the Note object with a read/write lock requires big changes in several classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] Reamer commented on pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
Reamer commented on pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#issuecomment-1014754014


   I have adjusted the code based on your comments.
   
   > @Reamer Overall this approach LGTM, just some minor comments
   
   I have adjusted the code based on your comments. Thanks for your review.
   I will merge this PR with the master branch on Friday, as long as no further comments are received.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] Reamer commented on a change in pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
Reamer commented on a change in pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#discussion_r736417406



##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/Notebook.java
##########
@@ -391,13 +392,76 @@ public Note getNote(String noteId, boolean reload) throws IOException {
     return note;
   }
 
-  public void saveNote(Note note, AuthenticationInfo subject) throws IOException {
-    noteManager.saveNote(note, subject);
+  /**
+   * Save an existing note in an eviction aware fashion with only before save callback.
+   *
+   * @param nodeId noteId to run save operation on
+   * @param subject current subject
+   * @param beforeSave callback for applying an changes to the note before saving it to the notebook repo.
+   *
+   * @see #saveExistingNote(String, AuthenticationInfo, NoteSaveCallback, NoteSaveCallback)
+   */
+  public <T> T saveExistingNote(String noteId, AuthenticationInfo subject,
+      NoteBeforeSaveCallback<T> beforeSave) throws IOException {
+    return saveExistingNote(noteId, subject, beforeSave, null);
+  }
+
+  /**
+   * Save an existing note in an eviction aware fashion using callbacks around the save operations.
+   * When no note exists for given noteId, only beforeSave callback is processed.
+   *
+   * @param nodeId noteId to run save operation on
+   * @param subject current subject
+   * @param beforeSave callback for applying an changes to the note before saving it to the notebook repo.
+   * @param afterSave callback for applying an changes to the note after saving it to the notebook repo.
+   * @return the result of the beforeSave callback
+   */
+  public <T> T saveExistingNote(String noteId, AuthenticationInfo subject,
+      NoteBeforeSaveCallback<T> beforeSave, NoteAfterSaveCallback<T> afterSave) throws IOException {
+    final Note note = getNote(noteId, false);
+
+    if (note == null) { // callback contract: run beforeSave callback also when no note exists for given noteId

Review comment:
       To return a correct answer as in this [case](https://github.com/apache/zeppelin/pull/4252/files#diff-5fca6cde02fb95c432560e3adfb7a7a20822a03cfcfb301c5262622dd8ac10abL353).
   Or to return the right exception as in this [case](https://github.com/apache/zeppelin/pull/4252/files#diff-5fca6cde02fb95c432560e3adfb7a7a20822a03cfcfb301c5262622dd8ac10abL558).
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] zjffdu commented on a change in pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
zjffdu commented on a change in pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#discussion_r742018916



##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/NoteManager.java
##########
@@ -651,5 +667,63 @@ public void updateNotePath() {
     }
   }
 
+  /**
+   * Unloads notes when reasonable amount of notes are in loaded state.
+   * Leverage a simple LRU cache for determing evictable notes. Ensure to
+   * not evict notes during save operation (dirty).

Review comment:
       Sorry for late response, @Reamer I will take a look at it again tomorrow




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] Reamer commented on pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
Reamer commented on pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#issuecomment-958761898


   @zjffdu 
   I have had very good experience with this feature in my environment and would like to include it in the master so that further development can be based on it.
   Since I have been using this feature, Zeppelin does not run in any OOM when I look at all the notes via script.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] Reamer commented on a change in pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
Reamer commented on a change in pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#discussion_r780078820



##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/Notebook.java
##########
@@ -352,43 +355,64 @@ public void removeCorruptedNote(String noteId, AuthenticationInfo subject) throw
     authorizationService.removeNoteAuth(noteId);
   }
 
+  /**
+   * Removes the Note with the NoteId.
+   * @param noteId
+   * @param subject
+   * @throws IOException when fail to get it from NotebookRepo
+   */
   public void removeNote(String noteId, AuthenticationInfo subject) throws IOException {
-    Note note = getNote(noteId);
-    removeNote(note, subject);
+    processNote(noteId,
+      note -> {
+        if (note != null) {
+          removeNote(note, subject);
+        }
+        return null;
+    });
   }
 
   /**
-   * Get note from NotebookRepo and also initialize it with other properties that is not
-   * persistent in NotebookRepo, such as paragraphJobListener.
+   * Process a note in an eviction aware manner. It initializes the note
+   * with other properties that is not persistent in NotebookRepo, such as paragraphJobListener.
+   * <p>
+   * Use {@link #processNote(String, boolean, NoteProcessor)} in case you want to force
+   * a note reload from the {@link #NotebookRepo}.
+   * </p>
    * @param noteId
-   * @return null if note not found.
-   * @throws IOException when fail to get it from NotebookRepo.
+   * @param noteProcessor
+   * @return result of the noteProcessor
+   * @throws IOException when fail to get it from {@link NotebookRepo}
    */
-  public Note getNote(String noteId) throws IOException {
-    return getNote(noteId, false);
+  public <T> T processNote(String noteId, NoteProcessor<T> noteProcessor) throws IOException {
+    return processNote(noteId, false, noteProcessor);
   }
 
   /**
-   * Get note from NotebookRepo and also initialize it with other properties that is not
-   * persistent in NotebookRepo, such as paragraphJobListener.
+   * Process a note in an eviction aware manner. It initializes the note
+   * with other properties that is not persistent in NotebookRepo, such as paragraphJobListener.
+   *
    * @param noteId
-   * @return null if note not found.
-   * @throws IOException when fail to get it from NotebookRepo.
+   * @param reload
+   * @param noteProcessor
+   * @return result of the noteProcessor
+   * @throws IOException when fail to get it from {@link NotebookRepo}
    */
-  public Note getNote(String noteId, boolean reload) throws IOException {
-    Note note = noteManager.getNote(noteId, reload);
-    if (note == null) {
-      return null;
-    }
-    note.setInterpreterFactory(replFactory);
-    note.setInterpreterSettingManager(interpreterSettingManager);
-    note.setParagraphJobListener(paragraphJobListener);
-    note.setNoteEventListeners(noteEventListeners);
-    note.setCredentials(credentials);
-    for (Paragraph p : note.getParagraphs()) {
-      p.setNote(note);
-    }
-    return note;
+  public <T> T processNote(String noteId, boolean reload, NoteProcessor<T> noteProcessor) throws IOException {
+    return noteManager.processNote(noteId, reload, note -> {
+      if (note == null) {
+        return noteProcessor.process(note);

Review comment:
       Yes, for several reasons.
   
   To return a correct response as in this [case](https://github.com/apache/zeppelin/pull/4252/files#diff-017bf51cce0cfa2d227c2a93230a2164bbad6b98de581a54dc4a778b49704497R128).
   
   ```java
   if (note == null) {
     return new JsonResponse<>(Response.Status.NOT_FOUND, "Note " + noteId + " not found").build();
   }
   ```
   Or to work on a callback as in this [case](https://github.com/apache/zeppelin/pull/4252/files#diff-94da03110f043520a11e2697bd4b6749134c14214ec240b65a61f6b0ed1f0005R66-R77)
   
   ```java
       return notebook.processNote(noteId,
         jobNote -> {
           List<NoteJobInfo> notesJobInfo = new ArrayList<>();
           if (jobNote == null) {
             callback.onFailure(new IOException("Note " + noteId + " not found"), context);
           } else {
             notesJobInfo.add(new NoteJobInfo(jobNote));
             callback.onSuccess(notesJobInfo, context);
           }
           return notesJobInfo;
         });
   ```
   
   for logging as in this [case](https://github.com/apache/zeppelin/pull/4252/files#diff-c8d59c9816d0b26045cd0cd44343eb10dbebd61730fadae84a9b147e3b6226f8R1829-R1831)
   
   ```java
         getNotebook().processNote(noteId,
           note -> {
             if (note == null) {
               // It is possible the note is removed, but the job is still running
               LOG.warn("Note {} doesn't existed, it maybe deleted.", noteId);
             } else {
               note.clearParagraphOutput(paragraphId);
               Paragraph paragraph = note.getParagraph(paragraphId);
               broadcastParagraph(note, paragraph, MSG_ID_NOT_DEFINED);
             }
             return null;
           });
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] asfgit closed pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] Reamer commented on pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
Reamer commented on pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#issuecomment-958761898


   @zjffdu 
   I have had very good experience with this feature in my environment and would like to include it in the master so that further development can be based on it.
   Since I have been using this feature, Zeppelin does not run in any OOM when I look at all the notes via script.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] Reamer commented on pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
Reamer commented on pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#issuecomment-985305613


   I have found bugs in this implementation, let me correct these bugs before reviewing.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] Reamer commented on pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
Reamer commented on pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#issuecomment-958761898


   @zjffdu 
   I have had very good experience with this feature in my environment and would like to include it in the master so that further development can be based on it.
   Since I have been using this feature, Zeppelin does not run in any OOM when I look at all the notes via script.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] Reamer edited a comment on pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
Reamer edited a comment on pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#issuecomment-1014754014


   > @Reamer Overall this approach LGTM, just some minor comments
   
   I have adjusted the code based on your comments. Thanks for your review.
   I will merge this PR with the master branch on Friday, as long as no further comments are received.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] zjffdu commented on pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
zjffdu commented on pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#issuecomment-996805233


   Thanks @Reamer I will try to find time to review this PR


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] zjffdu commented on a change in pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
zjffdu commented on a change in pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#discussion_r780673424



##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/NoteManager.java
##########
@@ -561,69 +569,79 @@ public String toString() {
    * This class has 2 usage scenarios:
    * 1. metadata of note (only noteId and note name is loaded via reading the file name)
    * 2. the note object (note content is loaded from NotebookRepo)
-   *
+   * <br>
    * It will load note from NotebookRepo lazily until method getNote is called.
+   * A NoteCache ensures to free up resources, because its size is limited.
    */
   public static class NoteNode {
 
     private Folder parent;
-    private Note note;
+    private NoteInfo noteinfo;
     private NotebookRepo notebookRepo;
+    private NoteCache noteCache;
 
-    public NoteNode(Note note, Folder parent, NotebookRepo notebookRepo) {
-      this.note = note;
+    public NoteNode(NoteInfo noteinfo, Folder parent, NotebookRepo notebookRepo, NoteCache noteCache) {
+      this.noteinfo = noteinfo;
       this.parent = parent;
       this.notebookRepo = notebookRepo;
-    }
-
-    public synchronized Note getNote() throws IOException {
-        return getNote(false);
+      this.noteCache = noteCache;
     }
 
     /**
-     * This method will load note from NotebookRepo. If you just want to get noteId, noteName or
+     * This method will process note in a eviction aware manner by loading it from NotebookRepo.
+     *
+     * If you just want to get noteId, noteName or
      * notePath, you can call method getNoteId, getNoteName & getNotePath
-     * @return
+     *
+     * @param reload force a reload from {@link NotebookRepo}
+     * @param noteProcessor callback
+     * @return result of the noteProcessor
      * @throws IOException
      */
-    public synchronized Note getNote(boolean reload) throws IOException {
-      if (!note.isLoaded() || reload) {
-        note = notebookRepo.get(note.getId(), note.getPath(), AuthenticationInfo.ANONYMOUS);
-        if (parent.toString().equals("/")) {
-          note.setPath("/" + note.getName());
-        } else {
-          note.setPath(parent.toString() + "/" + note.getName());
+    public <T> T loadAndProcessNote(boolean reload, NoteProcessor<T> noteProcessor)
+        throws IOException {
+        // load note

Review comment:
       minor indention issue 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] zjffdu commented on pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
zjffdu commented on pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#issuecomment-1004626478


   Sorry, @Reamer I will find time to review it by the end of this week.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] Reamer commented on a change in pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
Reamer commented on a change in pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#discussion_r736417406



##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/Notebook.java
##########
@@ -391,13 +392,76 @@ public Note getNote(String noteId, boolean reload) throws IOException {
     return note;
   }
 
-  public void saveNote(Note note, AuthenticationInfo subject) throws IOException {
-    noteManager.saveNote(note, subject);
+  /**
+   * Save an existing note in an eviction aware fashion with only before save callback.
+   *
+   * @param nodeId noteId to run save operation on
+   * @param subject current subject
+   * @param beforeSave callback for applying an changes to the note before saving it to the notebook repo.
+   *
+   * @see #saveExistingNote(String, AuthenticationInfo, NoteSaveCallback, NoteSaveCallback)
+   */
+  public <T> T saveExistingNote(String noteId, AuthenticationInfo subject,
+      NoteBeforeSaveCallback<T> beforeSave) throws IOException {
+    return saveExistingNote(noteId, subject, beforeSave, null);
+  }
+
+  /**
+   * Save an existing note in an eviction aware fashion using callbacks around the save operations.
+   * When no note exists for given noteId, only beforeSave callback is processed.
+   *
+   * @param nodeId noteId to run save operation on
+   * @param subject current subject
+   * @param beforeSave callback for applying an changes to the note before saving it to the notebook repo.
+   * @param afterSave callback for applying an changes to the note after saving it to the notebook repo.
+   * @return the result of the beforeSave callback
+   */
+  public <T> T saveExistingNote(String noteId, AuthenticationInfo subject,
+      NoteBeforeSaveCallback<T> beforeSave, NoteAfterSaveCallback<T> afterSave) throws IOException {
+    final Note note = getNote(noteId, false);
+
+    if (note == null) { // callback contract: run beforeSave callback also when no note exists for given noteId

Review comment:
       To return a correct response as in this [case](https://github.com/apache/zeppelin/pull/4252/files#diff-5fca6cde02fb95c432560e3adfb7a7a20822a03cfcfb301c5262622dd8ac10abL353).
   Or to return the right exception as in this [case](https://github.com/apache/zeppelin/pull/4252/files#diff-5fca6cde02fb95c432560e3adfb7a7a20822a03cfcfb301c5262622dd8ac10abL558).
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] Reamer commented on a change in pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
Reamer commented on a change in pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#discussion_r751568834



##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/Notebook.java
##########
@@ -391,13 +392,76 @@ public Note getNote(String noteId, boolean reload) throws IOException {
     return note;
   }
 
-  public void saveNote(Note note, AuthenticationInfo subject) throws IOException {
-    noteManager.saveNote(note, subject);
+  /**
+   * Save an existing note in an eviction aware fashion with only before save callback.
+   *
+   * @param nodeId noteId to run save operation on
+   * @param subject current subject
+   * @param beforeSave callback for applying an changes to the note before saving it to the notebook repo.
+   *
+   * @see #saveExistingNote(String, AuthenticationInfo, NoteSaveCallback, NoteSaveCallback)
+   */
+  public <T> T saveExistingNote(String noteId, AuthenticationInfo subject,
+      NoteBeforeSaveCallback<T> beforeSave) throws IOException {
+    return saveExistingNote(noteId, subject, beforeSave, null);
+  }
+
+  /**
+   * Save an existing note in an eviction aware fashion using callbacks around the save operations.
+   * When no note exists for given noteId, only beforeSave callback is processed.
+   *
+   * @param nodeId noteId to run save operation on
+   * @param subject current subject
+   * @param beforeSave callback for applying an changes to the note before saving it to the notebook repo.
+   * @param afterSave callback for applying an changes to the note after saving it to the notebook repo.
+   * @return the result of the beforeSave callback
+   */
+  public <T> T saveExistingNote(String noteId, AuthenticationInfo subject,
+      NoteBeforeSaveCallback<T> beforeSave, NoteAfterSaveCallback<T> afterSave) throws IOException {
+    final Note note = getNote(noteId, false);
+
+    if (note == null) { // callback contract: run beforeSave callback also when no note exists for given noteId

Review comment:
       `NoteBeforeSaveCallback` and `NoteAfterSaveCallback` has been removed




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] zjffdu commented on a change in pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
zjffdu commented on a change in pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#discussion_r742585405



##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/NoteManager.java
##########
@@ -651,5 +667,63 @@ public void updateNotePath() {
     }
   }
 
+  /**
+   * Unloads notes when reasonable amount of notes are in loaded state.
+   * Leverage a simple LRU cache for determing evictable notes. Ensure to
+   * not evict notes during save operation (dirty).

Review comment:
       @Reamer I still think there are still potential issue here. Let's say the number of notes in LRUCache is equal to 50, and now there's one call of handleEviction which will evict one note (Note#unload is called). At the same time, there're 50 threads (each thread read one note) which all call NotebookServer#broadcastNote to deserialize note to frontend. So at least one of them will be affected (one note will be unload). I have to admin it is very unlikely unless very high concurrency. But in theory, this kind of case exists. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] zjffdu commented on a change in pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
zjffdu commented on a change in pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#discussion_r742018916



##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/NoteManager.java
##########
@@ -651,5 +667,63 @@ public void updateNotePath() {
     }
   }
 
+  /**
+   * Unloads notes when reasonable amount of notes are in loaded state.
+   * Leverage a simple LRU cache for determing evictable notes. Ensure to
+   * not evict notes during save operation (dirty).

Review comment:
       Sorry for late response, @Reamer I will take a look at it again tomorrow

##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/NoteManager.java
##########
@@ -651,5 +667,63 @@ public void updateNotePath() {
     }
   }
 
+  /**
+   * Unloads notes when reasonable amount of notes are in loaded state.
+   * Leverage a simple LRU cache for determing evictable notes. Ensure to
+   * not evict notes during save operation (dirty).

Review comment:
       Sorry for late response, @Reamer I will take a look at it again tomorrow




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] zjffdu commented on pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
zjffdu commented on pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#issuecomment-1014089244


   @Reamer Overall this approach LGTM, just some minor comments


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] zjffdu commented on a change in pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
zjffdu commented on a change in pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#discussion_r742018916



##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/NoteManager.java
##########
@@ -651,5 +667,63 @@ public void updateNotePath() {
     }
   }
 
+  /**
+   * Unloads notes when reasonable amount of notes are in loaded state.
+   * Leverage a simple LRU cache for determing evictable notes. Ensure to
+   * not evict notes during save operation (dirty).

Review comment:
       Sorry for late response, @Reamer I will take a look at it again tomorrow




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] Reamer commented on a change in pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
Reamer commented on a change in pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#discussion_r740131720



##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/NoteManager.java
##########
@@ -651,5 +667,63 @@ public void updateNotePath() {
     }
   }
 
+  /**
+   * Unloads notes when reasonable amount of notes are in loaded state.
+   * Leverage a simple LRU cache for determing evictable notes. Ensure to
+   * not evict notes during save operation (dirty).

Review comment:
       In principle you are right. However, `GetNote()` always places the note at the beginning of the LRUCache. `GetNote()` is always called before information is read from a note.
   As long as the concurrent executions are below the lru cache threshold, there should be no problems with concurrency.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] Reamer commented on pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
Reamer commented on pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#issuecomment-945877742


   A review would be nice.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] zjffdu commented on a change in pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
zjffdu commented on a change in pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#discussion_r779985593



##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/Notebook.java
##########
@@ -352,43 +355,64 @@ public void removeCorruptedNote(String noteId, AuthenticationInfo subject) throw
     authorizationService.removeNoteAuth(noteId);
   }
 
+  /**
+   * Removes the Note with the NoteId.
+   * @param noteId
+   * @param subject
+   * @throws IOException when fail to get it from NotebookRepo
+   */
   public void removeNote(String noteId, AuthenticationInfo subject) throws IOException {
-    Note note = getNote(noteId);
-    removeNote(note, subject);
+    processNote(noteId,
+      note -> {
+        if (note != null) {
+          removeNote(note, subject);
+        }
+        return null;
+    });
   }
 
   /**
-   * Get note from NotebookRepo and also initialize it with other properties that is not
-   * persistent in NotebookRepo, such as paragraphJobListener.
+   * Process a note in an eviction aware manner. It initializes the note
+   * with other properties that is not persistent in NotebookRepo, such as paragraphJobListener.
+   * <p>
+   * Use {@link #processNote(String, boolean, NoteProcessor)} in case you want to force
+   * a note reload from the {@link #NotebookRepo}.
+   * </p>
    * @param noteId
-   * @return null if note not found.
-   * @throws IOException when fail to get it from NotebookRepo.
+   * @param noteProcessor
+   * @return result of the noteProcessor
+   * @throws IOException when fail to get it from {@link NotebookRepo}
    */
-  public Note getNote(String noteId) throws IOException {
-    return getNote(noteId, false);
+  public <T> T processNote(String noteId, NoteProcessor<T> noteProcessor) throws IOException {
+    return processNote(noteId, false, noteProcessor);
   }
 
   /**
-   * Get note from NotebookRepo and also initialize it with other properties that is not
-   * persistent in NotebookRepo, such as paragraphJobListener.
+   * Process a note in an eviction aware manner. It initializes the note
+   * with other properties that is not persistent in NotebookRepo, such as paragraphJobListener.
+   *
    * @param noteId
-   * @return null if note not found.
-   * @throws IOException when fail to get it from NotebookRepo.
+   * @param reload
+   * @param noteProcessor
+   * @return result of the noteProcessor
+   * @throws IOException when fail to get it from {@link NotebookRepo}
    */
-  public Note getNote(String noteId, boolean reload) throws IOException {
-    Note note = noteManager.getNote(noteId, reload);
-    if (note == null) {
-      return null;
-    }
-    note.setInterpreterFactory(replFactory);
-    note.setInterpreterSettingManager(interpreterSettingManager);
-    note.setParagraphJobListener(paragraphJobListener);
-    note.setNoteEventListeners(noteEventListeners);
-    note.setCredentials(credentials);
-    for (Paragraph p : note.getParagraphs()) {
-      p.setNote(note);
-    }
-    return note;
+  public <T> T processNote(String noteId, boolean reload, NoteProcessor<T> noteProcessor) throws IOException {
+    return noteManager.processNote(noteId, reload, note -> {
+      if (note == null) {
+        return noteProcessor.process(note);

Review comment:
       Is it necessary to process a note when it is null?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] Reamer commented on a change in pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
Reamer commented on a change in pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#discussion_r780078820



##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/Notebook.java
##########
@@ -352,43 +355,64 @@ public void removeCorruptedNote(String noteId, AuthenticationInfo subject) throw
     authorizationService.removeNoteAuth(noteId);
   }
 
+  /**
+   * Removes the Note with the NoteId.
+   * @param noteId
+   * @param subject
+   * @throws IOException when fail to get it from NotebookRepo
+   */
   public void removeNote(String noteId, AuthenticationInfo subject) throws IOException {
-    Note note = getNote(noteId);
-    removeNote(note, subject);
+    processNote(noteId,
+      note -> {
+        if (note != null) {
+          removeNote(note, subject);
+        }
+        return null;
+    });
   }
 
   /**
-   * Get note from NotebookRepo and also initialize it with other properties that is not
-   * persistent in NotebookRepo, such as paragraphJobListener.
+   * Process a note in an eviction aware manner. It initializes the note
+   * with other properties that is not persistent in NotebookRepo, such as paragraphJobListener.
+   * <p>
+   * Use {@link #processNote(String, boolean, NoteProcessor)} in case you want to force
+   * a note reload from the {@link #NotebookRepo}.
+   * </p>
    * @param noteId
-   * @return null if note not found.
-   * @throws IOException when fail to get it from NotebookRepo.
+   * @param noteProcessor
+   * @return result of the noteProcessor
+   * @throws IOException when fail to get it from {@link NotebookRepo}
    */
-  public Note getNote(String noteId) throws IOException {
-    return getNote(noteId, false);
+  public <T> T processNote(String noteId, NoteProcessor<T> noteProcessor) throws IOException {
+    return processNote(noteId, false, noteProcessor);
   }
 
   /**
-   * Get note from NotebookRepo and also initialize it with other properties that is not
-   * persistent in NotebookRepo, such as paragraphJobListener.
+   * Process a note in an eviction aware manner. It initializes the note
+   * with other properties that is not persistent in NotebookRepo, such as paragraphJobListener.
+   *
    * @param noteId
-   * @return null if note not found.
-   * @throws IOException when fail to get it from NotebookRepo.
+   * @param reload
+   * @param noteProcessor
+   * @return result of the noteProcessor
+   * @throws IOException when fail to get it from {@link NotebookRepo}
    */
-  public Note getNote(String noteId, boolean reload) throws IOException {
-    Note note = noteManager.getNote(noteId, reload);
-    if (note == null) {
-      return null;
-    }
-    note.setInterpreterFactory(replFactory);
-    note.setInterpreterSettingManager(interpreterSettingManager);
-    note.setParagraphJobListener(paragraphJobListener);
-    note.setNoteEventListeners(noteEventListeners);
-    note.setCredentials(credentials);
-    for (Paragraph p : note.getParagraphs()) {
-      p.setNote(note);
-    }
-    return note;
+  public <T> T processNote(String noteId, boolean reload, NoteProcessor<T> noteProcessor) throws IOException {
+    return noteManager.processNote(noteId, reload, note -> {
+      if (note == null) {
+        return noteProcessor.process(note);

Review comment:
       Yes, for several reasons.
   
   To return a correct response as in this [case](https://github.com/apache/zeppelin/pull/4252/files#diff-017bf51cce0cfa2d227c2a93230a2164bbad6b98de581a54dc4a778b49704497R128).
   
   ```java
   if (note == null) {
     return new JsonResponse<>(Response.Status.NOT_FOUND, "Note " + noteId + " not found").build();
   }
   ```
   Or to work on a callback as in this [case](https://github.com/apache/zeppelin/pull/4252/files#diff-94da03110f043520a11e2697bd4b6749134c14214ec240b65a61f6b0ed1f0005R66-R77)
   
   ```java
       return notebook.processNote(noteId,
         jobNote -> {
           List<NoteJobInfo> notesJobInfo = new ArrayList<>();
           if (jobNote == null) {
             callback.onFailure(new IOException("Note " + noteId + " not found"), context);
           } else {
             notesJobInfo.add(new NoteJobInfo(jobNote));
             callback.onSuccess(notesJobInfo, context);
           }
           return notesJobInfo;
         });
   ```
   
   for logging as in this [case](https://github.com/apache/zeppelin/pull/4252/files#diff-c8d59c9816d0b26045cd0cd44343eb10dbebd61730fadae84a9b147e3b6226f8R1829-R1831)
   
   ```java
         getNotebook().processNote(noteId,
           note -> {
             if (note == null) {
               // It is possible the note is removed, but the job is still running
               LOG.warn("Note {} doesn't existed, it maybe deleted.", noteId);
             } else {
               note.clearParagraphOutput(paragraphId);
               Paragraph paragraph = note.getParagraph(paragraphId);
               broadcastParagraph(note, paragraph, MSG_ID_NOT_DEFINED);
             }
             return null;
           });
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] zjffdu commented on a change in pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
zjffdu commented on a change in pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#discussion_r780675565



##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/NoteManager.java
##########
@@ -561,69 +569,79 @@ public String toString() {
    * This class has 2 usage scenarios:
    * 1. metadata of note (only noteId and note name is loaded via reading the file name)
    * 2. the note object (note content is loaded from NotebookRepo)
-   *
+   * <br>
    * It will load note from NotebookRepo lazily until method getNote is called.
+   * A NoteCache ensures to free up resources, because its size is limited.
    */
   public static class NoteNode {
 
     private Folder parent;
-    private Note note;
+    private NoteInfo noteinfo;
     private NotebookRepo notebookRepo;
+    private NoteCache noteCache;
 
-    public NoteNode(Note note, Folder parent, NotebookRepo notebookRepo) {
-      this.note = note;
+    public NoteNode(NoteInfo noteinfo, Folder parent, NotebookRepo notebookRepo, NoteCache noteCache) {
+      this.noteinfo = noteinfo;
       this.parent = parent;
       this.notebookRepo = notebookRepo;
-    }
-
-    public synchronized Note getNote() throws IOException {
-        return getNote(false);
+      this.noteCache = noteCache;
     }
 
     /**
-     * This method will load note from NotebookRepo. If you just want to get noteId, noteName or
+     * This method will process note in a eviction aware manner by loading it from NotebookRepo.
+     *
+     * If you just want to get noteId, noteName or
      * notePath, you can call method getNoteId, getNoteName & getNotePath
-     * @return
+     *
+     * @param reload force a reload from {@link NotebookRepo}
+     * @param noteProcessor callback
+     * @return result of the noteProcessor
      * @throws IOException
      */
-    public synchronized Note getNote(boolean reload) throws IOException {
-      if (!note.isLoaded() || reload) {
-        note = notebookRepo.get(note.getId(), note.getPath(), AuthenticationInfo.ANONYMOUS);
-        if (parent.toString().equals("/")) {
-          note.setPath("/" + note.getName());
-        } else {
-          note.setPath(parent.toString() + "/" + note.getName());
+    public <T> T loadAndProcessNote(boolean reload, NoteProcessor<T> noteProcessor)
+        throws IOException {
+        // load note
+      Note note;
+      synchronized (this) {
+        note = noteCache.getNote(noteinfo.getId());
+        if (note == null || reload) {
+          note = notebookRepo.get(noteinfo.getId(), noteinfo.getPath(), AuthenticationInfo.ANONYMOUS);
+          if (parent.toString().equals("/")) {
+            note.setPath("/" + note.getName());
+          } else {
+            note.setPath(parent.toString() + "/" + note.getName());
+          }
+          note.setCronSupported(ZeppelinConfiguration.create());
+          noteCache.putNote(note);
         }
-        note.setCronSupported(ZeppelinConfiguration.create());
-        note.setLoaded(true);
+        note.getLock().readLock().lock();

Review comment:
       Move it to the try block below seems more readable.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] Reamer commented on pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
Reamer commented on pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#issuecomment-996798016


   @zjffdu 
   I hope you find time between Christmas and New Year to look at the changes. I would like to add this PR to the master branch at the beginning of January. The PR drastically reduces Zeppelin's memory consumption.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] zjffdu edited a comment on pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
zjffdu edited a comment on pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#issuecomment-996805233


   Thanks @Reamer I will try to find time to review this PR,Happy Christmas!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] Reamer commented on pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
Reamer commented on pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#issuecomment-987908787


   > I have found bugs in this implementation, let me correct these bugs before reviewing.
   
   Bugs fixed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] zjffdu commented on a change in pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
zjffdu commented on a change in pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#discussion_r736304342



##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/Notebook.java
##########
@@ -391,13 +392,76 @@ public Note getNote(String noteId, boolean reload) throws IOException {
     return note;
   }
 
-  public void saveNote(Note note, AuthenticationInfo subject) throws IOException {
-    noteManager.saveNote(note, subject);
+  /**
+   * Save an existing note in an eviction aware fashion with only before save callback.
+   *
+   * @param nodeId noteId to run save operation on
+   * @param subject current subject
+   * @param beforeSave callback for applying an changes to the note before saving it to the notebook repo.
+   *
+   * @see #saveExistingNote(String, AuthenticationInfo, NoteSaveCallback, NoteSaveCallback)
+   */
+  public <T> T saveExistingNote(String noteId, AuthenticationInfo subject,
+      NoteBeforeSaveCallback<T> beforeSave) throws IOException {
+    return saveExistingNote(noteId, subject, beforeSave, null);
+  }
+
+  /**
+   * Save an existing note in an eviction aware fashion using callbacks around the save operations.
+   * When no note exists for given noteId, only beforeSave callback is processed.
+   *
+   * @param nodeId noteId to run save operation on
+   * @param subject current subject
+   * @param beforeSave callback for applying an changes to the note before saving it to the notebook repo.
+   * @param afterSave callback for applying an changes to the note after saving it to the notebook repo.
+   * @return the result of the beforeSave callback
+   */
+  public <T> T saveExistingNote(String noteId, AuthenticationInfo subject,
+      NoteBeforeSaveCallback<T> beforeSave, NoteAfterSaveCallback<T> afterSave) throws IOException {
+    final Note note = getNote(noteId, false);
+
+    if (note == null) { // callback contract: run beforeSave callback also when no note exists for given noteId

Review comment:
       why run beforeSave even no note exists ?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] Reamer commented on a change in pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
Reamer commented on a change in pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#discussion_r786185989



##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/NoteManager.java
##########
@@ -561,69 +569,79 @@ public String toString() {
    * This class has 2 usage scenarios:
    * 1. metadata of note (only noteId and note name is loaded via reading the file name)
    * 2. the note object (note content is loaded from NotebookRepo)
-   *
+   * <br>
    * It will load note from NotebookRepo lazily until method getNote is called.
+   * A NoteCache ensures to free up resources, because its size is limited.
    */
   public static class NoteNode {
 
     private Folder parent;
-    private Note note;
+    private NoteInfo noteinfo;

Review comment:
       adjusted

##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/NoteManager.java
##########
@@ -561,69 +569,79 @@ public String toString() {
    * This class has 2 usage scenarios:
    * 1. metadata of note (only noteId and note name is loaded via reading the file name)
    * 2. the note object (note content is loaded from NotebookRepo)
-   *
+   * <br>
    * It will load note from NotebookRepo lazily until method getNote is called.
+   * A NoteCache ensures to free up resources, because its size is limited.
    */
   public static class NoteNode {
 
     private Folder parent;
-    private Note note;
+    private NoteInfo noteinfo;
     private NotebookRepo notebookRepo;
+    private NoteCache noteCache;
 
-    public NoteNode(Note note, Folder parent, NotebookRepo notebookRepo) {
-      this.note = note;
+    public NoteNode(NoteInfo noteinfo, Folder parent, NotebookRepo notebookRepo, NoteCache noteCache) {
+      this.noteinfo = noteinfo;
       this.parent = parent;
       this.notebookRepo = notebookRepo;
-    }
-
-    public synchronized Note getNote() throws IOException {
-        return getNote(false);
+      this.noteCache = noteCache;
     }
 
     /**
-     * This method will load note from NotebookRepo. If you just want to get noteId, noteName or
+     * This method will process note in a eviction aware manner by loading it from NotebookRepo.
+     *
+     * If you just want to get noteId, noteName or
      * notePath, you can call method getNoteId, getNoteName & getNotePath
-     * @return
+     *
+     * @param reload force a reload from {@link NotebookRepo}
+     * @param noteProcessor callback
+     * @return result of the noteProcessor
      * @throws IOException
      */
-    public synchronized Note getNote(boolean reload) throws IOException {
-      if (!note.isLoaded() || reload) {
-        note = notebookRepo.get(note.getId(), note.getPath(), AuthenticationInfo.ANONYMOUS);
-        if (parent.toString().equals("/")) {
-          note.setPath("/" + note.getName());
-        } else {
-          note.setPath(parent.toString() + "/" + note.getName());
+    public <T> T loadAndProcessNote(boolean reload, NoteProcessor<T> noteProcessor)
+        throws IOException {
+        // load note
+      Note note;
+      synchronized (this) {
+        note = noteCache.getNote(noteinfo.getId());
+        if (note == null || reload) {
+          note = notebookRepo.get(noteinfo.getId(), noteinfo.getPath(), AuthenticationInfo.ANONYMOUS);
+          if (parent.toString().equals("/")) {
+            note.setPath("/" + note.getName());
+          } else {
+            note.setPath(parent.toString() + "/" + note.getName());
+          }
+          note.setCronSupported(ZeppelinConfiguration.create());
+          noteCache.putNote(note);
         }
-        note.setCronSupported(ZeppelinConfiguration.create());
-        note.setLoaded(true);
+        note.getLock().readLock().lock();

Review comment:
       adjusted

##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/Notebook.java
##########
@@ -748,4 +766,20 @@ public Boolean isRevisionSupported() {
       return false;
     }
   }
+
+  /**
+   * Functional Interface for note processing.
+   */
+  @FunctionalInterface
+  public static interface NoteProcessor<T> {

Review comment:
       adjusted

##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/NoteManager.java
##########
@@ -561,69 +569,79 @@ public String toString() {
    * This class has 2 usage scenarios:
    * 1. metadata of note (only noteId and note name is loaded via reading the file name)
    * 2. the note object (note content is loaded from NotebookRepo)
-   *
+   * <br>
    * It will load note from NotebookRepo lazily until method getNote is called.
+   * A NoteCache ensures to free up resources, because its size is limited.
    */
   public static class NoteNode {
 
     private Folder parent;
-    private Note note;
+    private NoteInfo noteinfo;
     private NotebookRepo notebookRepo;
+    private NoteCache noteCache;
 
-    public NoteNode(Note note, Folder parent, NotebookRepo notebookRepo) {
-      this.note = note;
+    public NoteNode(NoteInfo noteinfo, Folder parent, NotebookRepo notebookRepo, NoteCache noteCache) {
+      this.noteinfo = noteinfo;
       this.parent = parent;
       this.notebookRepo = notebookRepo;
-    }
-
-    public synchronized Note getNote() throws IOException {
-        return getNote(false);
+      this.noteCache = noteCache;
     }
 
     /**
-     * This method will load note from NotebookRepo. If you just want to get noteId, noteName or
+     * This method will process note in a eviction aware manner by loading it from NotebookRepo.
+     *
+     * If you just want to get noteId, noteName or
      * notePath, you can call method getNoteId, getNoteName & getNotePath
-     * @return
+     *
+     * @param reload force a reload from {@link NotebookRepo}
+     * @param noteProcessor callback
+     * @return result of the noteProcessor
      * @throws IOException
      */
-    public synchronized Note getNote(boolean reload) throws IOException {
-      if (!note.isLoaded() || reload) {
-        note = notebookRepo.get(note.getId(), note.getPath(), AuthenticationInfo.ANONYMOUS);
-        if (parent.toString().equals("/")) {
-          note.setPath("/" + note.getName());
-        } else {
-          note.setPath(parent.toString() + "/" + note.getName());
+    public <T> T loadAndProcessNote(boolean reload, NoteProcessor<T> noteProcessor)
+        throws IOException {
+        // load note

Review comment:
       adjusted




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [zeppelin] zjffdu commented on a change in pull request #4252: [ZEPPELIN-5559] Add note cache

Posted by GitBox <gi...@apache.org>.
zjffdu commented on a change in pull request #4252:
URL: https://github.com/apache/zeppelin/pull/4252#discussion_r780675790



##########
File path: zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/Notebook.java
##########
@@ -748,4 +766,20 @@ public Boolean isRevisionSupported() {
       return false;
     }
   }
+
+  /**
+   * Functional Interface for note processing.
+   */
+  @FunctionalInterface
+  public static interface NoteProcessor<T> {

Review comment:
       static is redundant




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@zeppelin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org