You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2020/05/21 06:46:31 UTC

[GitHub] [pulsar] srkukarni opened a new pull request #7010: Fix null pointer when getting function instance metrics.

srkukarni opened a new pull request #7010:
URL: https://github.com/apache/pulsar/pull/7010


   <!--
   ### Contribution Checklist
     
     - Name the pull request in the form "[Issue XYZ][component] Title of the pull request", where *XYZ* should be replaced by the actual issue number.
       Skip *Issue XYZ* if there is no associated github issue for this pull request.
       Skip *component* if you are unsure about which is the best component. E.g. `[docs] Fix typo in produce method`.
   
     - Fill out the template below to describe the changes contributed by the pull request. That will give reviewers the context they need to do the review.
     
     - Each pull request should address only one issue, not mix up code from multiple issues.
     
     - Each commit in the pull request has a meaningful commit message
   
     - Once all items of the checklist are addressed, remove the above text and this checklist, leaving only the filled out template below.
   
   **(The sections below can be removed for hotfixes of typos)**
   -->
   
   *(If this PR fixes a github issue, please add `Fixes #<xyz>`.)*
   
   Fixes #<xyz>
   
   *(or if this PR is one task of a github issue, please add `Master Issue: #<xyz>` to link to the master issue.)*
   
   Master Issue: #<xyz>
   
   ### Motivation
   
   stats is an internal variable inside JavaInstanceRunnable that shouldn't be needlessly exposed.
   This pr fixes its access(as well as a couple of other exposed variables which are never accessed outside the class).
   
   ### Modifications
   
   *Describe the modifications you've done.*
   
   ### Verifying this change
   
   - [ ] Make sure that the change passes the CI checks.
   
   *(Please pick either of the following options)*
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   *(or)*
   
   This change is already covered by existing tests, such as *(please describe tests)*.
   
   *(or)*
   
   This change added tests and can be verified as follows:
   
   *(example:)*
     - *Added integration tests for end-to-end deployment with large payloads (10MB)*
     - *Extended integration test for recovery after broker failure*
   
   ### Does this pull request potentially affect one of the following parts:
   
   *If `yes` was chosen, please highlight the changes*
   
     - Dependencies (does it add or upgrade a dependency): (yes / no)
     - The public API: (yes / no)
     - The schema: (yes / no / don't know)
     - The default values of configurations: (yes / no)
     - The wire protocol: (yes / no)
     - The rest endpoints: (yes / no)
     - The admin cli options: (yes / no)
     - Anything that affects deployment: (yes / no / don't know)
   
   ### Documentation
   
     - Does this pull request introduce a new feature? (yes / no)
     - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
     - If a feature is not applicable for documentation, explain why?
     - If a feature is not documented yet in this PR, please create a followup issue for adding the documentation
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] srkukarni commented on pull request #7010: Fix null pointer when getting function instance metrics.

Posted by GitBox <gi...@apache.org>.
srkukarni commented on pull request #7010:
URL: https://github.com/apache/pulsar/pull/7010#issuecomment-632344106


   To unblock this pr, I've removed stats changes. Lets discuss that in a separate issue


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] srkukarni commented on a change in pull request #7010: Fix null pointer when getting function instance metrics.

Posted by GitBox <gi...@apache.org>.
srkukarni commented on a change in pull request #7010:
URL: https://github.com/apache/pulsar/pull/7010#discussion_r428732247



##########
File path: pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java
##########
@@ -801,4 +810,12 @@ public void setupOutput(ContextImpl contextImpl) throws Exception {
             Thread.currentThread().setContextClassLoader(this.instanceClassLoader);
         }
     }
+
+    public String getStatsAsString() throws IOException {

Review comment:
       yes. Done




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] srkukarni commented on pull request #7010: Fix null pointer when getting function instance metrics.

Posted by GitBox <gi...@apache.org>.
srkukarni commented on pull request #7010:
URL: https://github.com/apache/pulsar/pull/7010#issuecomment-632290148


   @jerrypeng 
   1. Some methods are updating multiple counters/gauages. That still has to be in sync block to ensure that we get consistent stats
   2. The fact that today this class is using an already thread safe data structure is merely detail. If for whatever reason we switch it to some other impl, the methods should still be thread safe. Thus don;t you think its better to declare the methods synchronized explicitly?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] jerrypeng commented on pull request #7010: Fix null pointer when getting function instance metrics.

Posted by GitBox <gi...@apache.org>.
jerrypeng commented on pull request #7010:
URL: https://github.com/apache/pulsar/pull/7010#issuecomment-632331138


   @srkukarni 
   
   > Some methods are updating multiple counters/gauages. That still has to be in sync block to ensure that we get consistent stats.  
   
   Given a specific point of time, we cannot guarantee that all stats retrieved will be consistent regardless of whether we synchronize the methods or not.  Until less we update all stats at the end of each function invocation in an atomic fashion, there can also be slight deviations. And that is not accounting user stats that can be updated at any time and out of our control.  Adding synchronization on all methods may add overhead and contention that is not necessary.  I don't know of any system that attempts to update all metrics in an atomic fashion.
   
   > The fact that today this class is using an already thread safe data structure is merely detail. If for whatever reason we switch it to some other impl, the methods should still be thread safe. Thus don;t you think its better to declare the methods synchronized explicitly?
   
   Why would we consider using another implementation?  Are there any deficiencies in the current solution? The stats manager is not even an interface in which the user specify the implementation. I don't think it worthwhile to consider "what if" situations at this moment.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] srkukarni commented on a change in pull request #7010: Fix null pointer when getting function instance metrics.

Posted by GitBox <gi...@apache.org>.
srkukarni commented on a change in pull request #7010:
URL: https://github.com/apache/pulsar/pull/7010#discussion_r428737674



##########
File path: pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java
##########
@@ -801,4 +810,12 @@ public void setupOutput(ContextImpl contextImpl) throws Exception {
             Thread.currentThread().setContextClassLoader(this.instanceClassLoader);
         }
     }
+
+    public String getStatsAsString() throws IOException {

Review comment:
       There is a follow up change coming up that makes stats class threadsafe. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] srkukarni commented on a change in pull request #7010: Fix null pointer when getting function instance metrics.

Posted by GitBox <gi...@apache.org>.
srkukarni commented on a change in pull request #7010:
URL: https://github.com/apache/pulsar/pull/7010#discussion_r428968508



##########
File path: pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java
##########
@@ -546,13 +547,17 @@ synchronized public void close() {
         }
     }
 
-    public InstanceCommunication.MetricsData getAndResetMetrics() {
+    // This method is synchronized because it is using the stats variable
+    synchronized public InstanceCommunication.MetricsData getAndResetMetrics() {
         InstanceCommunication.MetricsData metricsData = getMetrics();
-        stats.reset();
+        if (stats != null) {
+            stats.reset();
+        }
         return metricsData;
     }
 
-    public InstanceCommunication.MetricsData getMetrics() {
+    // This method is synchronized because it is using the stats and javaInstance variables
+    synchronized public InstanceCommunication.MetricsData getMetrics() {

Review comment:
       Changed it




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] jerrypeng commented on a change in pull request #7010: Fix null pointer when getting function instance metrics.

Posted by GitBox <gi...@apache.org>.
jerrypeng commented on a change in pull request #7010:
URL: https://github.com/apache/pulsar/pull/7010#discussion_r428924499



##########
File path: pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java
##########
@@ -546,13 +547,17 @@ synchronized public void close() {
         }
     }
 
-    public InstanceCommunication.MetricsData getAndResetMetrics() {
+    // This method is synchronized because it is using the stats variable
+    synchronized public InstanceCommunication.MetricsData getAndResetMetrics() {
         InstanceCommunication.MetricsData metricsData = getMetrics();
-        stats.reset();
+        if (stats != null) {
+            stats.reset();
+        }
         return metricsData;
     }
 
-    public InstanceCommunication.MetricsData getMetrics() {
+    // This method is synchronized because it is using the stats and javaInstance variables
+    synchronized public InstanceCommunication.MetricsData getMetrics() {

Review comment:
       Even though synchronized methods are re-entrant in Java, it still appears a little weird that a synchronized getMetrics() be called inside another synchronized method "getAndResetMetrics()".  Maybe have another "base" private method that gets called by both that is not synchronized?
   
     A side note, I think in the future, we might want to define more interfaces for "modules" in functions, so that it is more clear on what methods should be exposed.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] jerrypeng edited a comment on pull request #7010: Fix null pointer when getting function instance metrics.

Posted by GitBox <gi...@apache.org>.
jerrypeng edited a comment on pull request #7010:
URL: https://github.com/apache/pulsar/pull/7010#issuecomment-632331138


   @srkukarni 
   
   > Some methods are updating multiple counters/gauages. That still has to be in sync block to ensure that we get consistent stats.  
   
   Given a specific point of time, we cannot guarantee that all stats retrieved will be consistent regardless of whether we synchronize the methods or not.  Until less we update all stats at the end of each function invocation in an atomic fashion, there can also be slight deviations. And that is not accounting user stats that can be updated at any time and out of our control.  Adding synchronization on all methods may add overhead and contention that is not necessary.  I don't know of any system that attempts to update all metrics in an atomic fashion.
   
   > The fact that today this class is using an already thread safe data structure is merely detail. If for whatever reason we switch it to some other impl, the methods should still be thread safe. Thus don;t you think its better to declare the methods synchronized explicitly?
   
   Why would we consider using another implementation?  Are there any deficiencies in the current solution? The stats manager is not even an interface in which the user specify the implementation. Since there is not contract with developers, we can change the implementation of how stats are handle whenever we feel like the implementation is not good enough. At that time we can synchronize methods if necessary.  I don't think there is a point to pre-empt something like this given we don't even know a new stats implementation would even look like.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] srkukarni merged pull request #7010: Fix null pointer when getting function instance metrics.

Posted by GitBox <gi...@apache.org>.
srkukarni merged pull request #7010:
URL: https://github.com/apache/pulsar/pull/7010


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] srkukarni commented on a change in pull request #7010: Fix null pointer when getting function instance metrics.

Posted by GitBox <gi...@apache.org>.
srkukarni commented on a change in pull request #7010:
URL: https://github.com/apache/pulsar/pull/7010#discussion_r428851965



##########
File path: pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java
##########
@@ -801,4 +810,12 @@ public void setupOutput(ContextImpl contextImpl) throws Exception {
             Thread.currentThread().setContextClassLoader(this.instanceClassLoader);
         }
     }
+
+    public String getStatsAsString() throws IOException {

Review comment:
       Actually I've made the stats class synchronized to ensure that it becomes threadsafe.
   Also have moved the setup syncronized, so that it and close won't be able to run together




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] jerrypeng commented on a change in pull request #7010: Fix null pointer when getting function instance metrics.

Posted by GitBox <gi...@apache.org>.
jerrypeng commented on a change in pull request #7010:
URL: https://github.com/apache/pulsar/pull/7010#discussion_r428924499



##########
File path: pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java
##########
@@ -546,13 +547,17 @@ synchronized public void close() {
         }
     }
 
-    public InstanceCommunication.MetricsData getAndResetMetrics() {
+    // This method is synchronized because it is using the stats variable
+    synchronized public InstanceCommunication.MetricsData getAndResetMetrics() {
         InstanceCommunication.MetricsData metricsData = getMetrics();
-        stats.reset();
+        if (stats != null) {
+            stats.reset();
+        }
         return metricsData;
     }
 
-    public InstanceCommunication.MetricsData getMetrics() {
+    // This method is synchronized because it is using the stats and javaInstance variables
+    synchronized public InstanceCommunication.MetricsData getMetrics() {

Review comment:
       Even though synchronized methods are re-entrant in Java, it still appears a little weird that a synchronized getMetrics() be called inside another synchronized method "getAndResetMetrics()".  Maybe have another "base" private method that gets called by both that is not synchronized.  A side note, I think in the future, we might want to define more interfaces for "modules" in functions, so that it is more clear on what methods should be exposed.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] cckellogg commented on a change in pull request #7010: Fix null pointer when getting function instance metrics.

Posted by GitBox <gi...@apache.org>.
cckellogg commented on a change in pull request #7010:
URL: https://github.com/apache/pulsar/pull/7010#discussion_r428723696



##########
File path: pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java
##########
@@ -801,4 +810,12 @@ public void setupOutput(ContextImpl contextImpl) throws Exception {
             Thread.currentThread().setContextClassLoader(this.instanceClassLoader);
         }
     }
+
+    public String getStatsAsString() throws IOException {
+        if (stats != null) {

Review comment:
       We should synchronize this method as well?  
   
   Also line 241 (run() method) accesses the stats variable and the close() method does too. Difference threads access these methods and change the stats variable. It's the same for other variables in this class. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] cckellogg commented on a change in pull request #7010: Fix null pointer when getting function instance metrics.

Posted by GitBox <gi...@apache.org>.
cckellogg commented on a change in pull request #7010:
URL: https://github.com/apache/pulsar/pull/7010#discussion_r428728617



##########
File path: pulsar-functions/instance/src/main/java/org/apache/pulsar/functions/instance/JavaInstanceRunnable.java
##########
@@ -801,4 +810,12 @@ public void setupOutput(ContextImpl contextImpl) throws Exception {
             Thread.currentThread().setContextClassLoader(this.instanceClassLoader);
         }
     }
+
+    public String getStatsAsString() throws IOException {

Review comment:
       We should synchronize this method as well?  
   
   Also line 241 (run() method) accesses the stats variable and the close() method does too. Difference threads access these methods and change the stats variable. It's the same for other variables in this class. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] jerrypeng edited a comment on pull request #7010: Fix null pointer when getting function instance metrics.

Posted by GitBox <gi...@apache.org>.
jerrypeng edited a comment on pull request #7010:
URL: https://github.com/apache/pulsar/pull/7010#issuecomment-632331138


   @srkukarni 
   
   > Some methods are updating multiple counters/gauages. That still has to be in sync block to ensure that we get consistent stats.  
   
   Given a specific point of time, we cannot guarantee that all stats retrieved will be consistent regardless of whether we synchronize the methods or not.  Until less we update all stats at the end of each function invocation in an atomic fashion, there can also be slight deviations. And that is not accounting user stats that can be updated at any time and out of our control.  Adding synchronization on all methods may add overhead and contention that is not necessary.  I don't know of any system that attempts to update all metrics in an atomic fashion.
   
   > The fact that today this class is using an already thread safe data structure is merely detail. If for whatever reason we switch it to some other impl, the methods should still be thread safe. Thus don;t you think its better to declare the methods synchronized explicitly?
   
   Why would we consider using another implementation?  Are there any deficiencies in the current solution? The stats manager is not even an interface in which the user specify the implementation. Since there is not contract with developers, we can change the implementation of how stats are handle whenever we feel like the implementation is not good enough. At that time we can synchronize methods if necessary.  I don't think there is a point to pre-empt something like this given we don't even know what a new stats implementation would even look like.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org