You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/05/18 22:21:15 UTC

[GitHub] [pulsar] nlu90 opened a new pull request #10631: reorganize the context hierarchy for functions

nlu90 opened a new pull request #10631:
URL: https://github.com/apache/pulsar/pull/10631


   ### Motivation
   
   Currently the context relationship for function, source and sink is not well defined. This prevents some common features to be added once for all and creates some confusion, code duplication in the current repo. As demonstrated in the following graph, this PR changes the hierarchy from left to right. By introducing a common base context, it help solving some issues we are seeing. The base context provides common access to pulsar cluster, state, metrics, and meta-data to make sure all components can reuse it.
   
   ![context hierarchy](https://user-images.githubusercontent.com/16407807/118730483-8ebf5200-b7ec-11eb-9220-d41261f148bb.png)
   
   
   
   ### Modifications
   
   - Remove `ConnectorContext` interface.
   - Introduce a `BaseContext` interface. 
   - Update existing `Context`, `SourceContext`, `SinkContext` interface to extend the new common interface.
   
   ### Verifying this change
   
   - [ ] Make sure that the change passes the CI checks.
   
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ### Does this pull request potentially affect one of the following parts:
   
   *If `yes` was chosen, please highlight the changes*
   
     - Dependencies (does it add or upgrade a dependency): (yes / no)
     - **The public API:**
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] jerrypeng commented on pull request #10631: [Functions] reorganize the context hierarchy for functions

Posted by GitBox <gi...@apache.org>.
jerrypeng commented on pull request #10631:
URL: https://github.com/apache/pulsar/pull/10631#issuecomment-858151590


   @nlu90 
   
   > Based on my understanding, these are pulsar cluster interaction methods that connectors may need. And actually, I'm planning to add one more interface getPulsarClient as a following PR to fix #8668 more concretely.
   
   1. I think the goal and scope of this PR is  refactor some of the code for various context interfaces so there is less duplicated code. Let's make the changes that are within that scope.   Let's not try to do too many things in one PR.
   
   2. If we want to expose more of the interfaces of sources and sinks.  Let's have a clear use case in mind before exposing additional interfaces.  I am not in favor of just bulk adding additional interfaces for sources and sinks.  I don't want to maintain functionality in sources and sinks that is not useful.  For example, for a sink, is there a concrete use case in which a sink needs to publish a message to another pulsar topic?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] nlu90 commented on pull request #10631: [Functions] reorganize the context hierarchy for functions

Posted by GitBox <gi...@apache.org>.
nlu90 commented on pull request #10631:
URL: https://github.com/apache/pulsar/pull/10631#issuecomment-865359519


   @sijie I rebased the master.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] nlu90 commented on pull request #10631: [Functions] reorganize the context hierarchy for functions

Posted by GitBox <gi...@apache.org>.
nlu90 commented on pull request #10631:
URL: https://github.com/apache/pulsar/pull/10631#issuecomment-865359519


   @sijie I rebased the master.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] nlu90 commented on pull request #10631: [Functions] reorganize the context hierarchy for functions

Posted by GitBox <gi...@apache.org>.
nlu90 commented on pull request #10631:
URL: https://github.com/apache/pulsar/pull/10631#issuecomment-856215827


   > While I am ok with doing some refactoring in the interfaces we have for Source, Sink, and Function context, this PR also exposes interfaces that are originally only for FunctionContext to SourceContext and SinkContext. We should consider carefully before introducing these new interfaces in Source and Sink Context. I don't want to expose functionality that is not needed.
   > 
   > If this is just a refactor, please keep the current interfaces exposed for FunctionContext, SourceContext, and SinkContext.
   
   @jerrypeng Thanks for the review.  Despite the previous shared interfaces, the followings new interfaces are exposed to connectors:
   
   ```
   default <S extends StateStore> S getStateStore(String name) {
           throw new UnsupportedOperationException("Component cannot get state store");
       }
   
   default PulsarAdmin getPulsarAdmin() {
           throw new UnsupportedOperationException("Component cannot access pulsar admin");
       }
       
   default PulsarAdmin getPulsarAdmin(String clusterName) {
           throw new UnsupportedOperationException("Component cannot access pulsar admin");
       }
   
   default <O> CompletableFuture<Void> publish(String topicName, O object, String schemaOrSerdeClassName) {
           throw new UnsupportedOperationException("Component cannot publish message");
       }
   
   default <O> CompletableFuture<Void> publish(String topicName, O object) {
           throw new UnsupportedOperationException("Component cannot publish message");
       }
   
   default <O> TypedMessageBuilder<O> newOutputMessage(String clusterName, String topicName, Schema<O> schema)
               throws PulsarClientException {
           throw new UnsupportedOperationException("Component can not output message to pulsar cluster");
       }
       
   default <O> TypedMessageBuilder<O> newOutputMessage(String topicName, Schema<O> schema)
               throws PulsarClientException {
           throw new UnsupportedOperationException("Component can not output message to pulsar cluster");
       }
   
   default <O> ConsumerBuilder<O> newConsumerBuilder(Schema<O> schema) throws PulsarClientException {
           throw new UnsupportedOperationException("Component can not create consumer builder");
       }
   ```
   
   Based on my understanding, these are pulsar cluster interaction methods that connectors may need. And actually, I'm planning to add one more interface `getPulsarClient` as a following PR to fix https://github.com/apache/pulsar/pull/8668 more concretely. 
   
   But I might be missing some of the context here, so please let me know which of the newly exposed functionality is not needed and we can discuss and update accordingly.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] jerrypeng commented on a change in pull request #10631: [Functions] reorganize the context hierarchy for functions

Posted by GitBox <gi...@apache.org>.
jerrypeng commented on a change in pull request #10631:
URL: https://github.com/apache/pulsar/pull/10631#discussion_r645873174



##########
File path: pulsar-client-1x-base/pulsar-client-1x/src/main/java/org/apache/pulsar/client/api/Producer.java
##########
@@ -84,7 +84,7 @@
     void flush() throws PulsarClientException;
 
     /**
-     * Flush all the messages buffered in the client and wait until all messages have been successfully persisted.
+     *  Flush all the messages buffered in the client Asynchronously.

Review comment:
       nit. Asynchronously -> asynchronously




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] nlu90 commented on pull request #10631: [Functions] reorganize the context hierarchy for functions

Posted by GitBox <gi...@apache.org>.
nlu90 commented on pull request #10631:
URL: https://github.com/apache/pulsar/pull/10631#issuecomment-858267166


   > @nlu90
   > 
   > > Based on my understanding, these are pulsar cluster interaction methods that connectors may need. And actually, I'm planning to add one more interface getPulsarClient as a following PR to fix #8668 more concretely.
   > 
   > 1. I think the goal and scope of this PR is  refactor some of the code for various context interfaces so there is less duplicated code. Let's make the changes that are within that scope.   Let's not try to do too many things in one PR.
   > 2. If we want to expose more of the interfaces of sources and sinks.  Let's have a clear use case in mind before exposing additional interfaces.  I am not in favor of just bulk adding additional interfaces for sources and sinks.  I don't want to maintain functionality in sources and sinks that is not useful.  For example, for a sink, is there a concrete use case in which a sink needs to publish a message to another pulsar topic?
   
   Sounds good to me.
   
   I'll limit the scope of this PR for only refactoring apis. Keep those listed apis only in the FunctionContext. And send additional PRs if apis are needed.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] sijie commented on pull request #10631: [Functions] reorganize the context hierarchy for functions

Posted by GitBox <gi...@apache.org>.
sijie commented on pull request #10631:
URL: https://github.com/apache/pulsar/pull/10631#issuecomment-863787516


   @nlu90 Can you rebase the PR to the latest master?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] jerrypeng commented on a change in pull request #10631: [Functions] reorganize the context hierarchy for functions

Posted by GitBox <gi...@apache.org>.
jerrypeng commented on a change in pull request #10631:
URL: https://github.com/apache/pulsar/pull/10631#discussion_r645876916



##########
File path: pulsar-functions/api-java/src/main/java/org/apache/pulsar/functions/api/Context.java
##########
@@ -40,15 +36,8 @@
  * executing function
  */
 @InterfaceAudience.Public
-@InterfaceStability.Stable
-public interface Context {
-    /**
-     * Access the record associated with the current input value.
-     *
-     * @return
-     */
-    Record<?> getCurrentRecord();
-
+@InterfaceStability.Unstable

Review comment:
       why are changing the InterfaceStability level?  Is this change no backwards compatible? 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] sijie merged pull request #10631: [Functions] reorganize the context hierarchy for functions

Posted by GitBox <gi...@apache.org>.
sijie merged pull request #10631:
URL: https://github.com/apache/pulsar/pull/10631


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] nlu90 commented on pull request #10631: [Functions] reorganize the context hierarchy for functions

Posted by GitBox <gi...@apache.org>.
nlu90 commented on pull request #10631:
URL: https://github.com/apache/pulsar/pull/10631#issuecomment-866382985


   @jerrypeng @eolivelli
   
   Could you take another look at the PR?  I've updated it based on the comments and passed all the CI check


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org