Posted to commits@samza.apache.org by "Navina Ramesh (JIRA)" <ji...@apache.org> on 2017/02/09 00:26:42 UTC
[jira] [Commented] (SAMZA-1084) User thread does not see errors from the processor thread when using the StreamProcessor API

    [ https://issues.apache.org/jira/browse/SAMZA-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858759#comment-15858759 ] 

Navina Ramesh commented on SAMZA-1084:
--------------------------------------

Discussed offline with Xinyu. We narrowed it down to two options for changing the implementation:
1. Run the container in a processor-managed thread pool and return the container's Future handle to the caller. Let the caller decide whether or not to block on the future. We can still keep the awaitStart() API and use the future handle to block and to handle errors/completion (potentially in a different thread than the caller's).
* Pros: 
** We still have an asynchronous start method and a way to guarantee start-up using awaitStart
** Caller has a choice on whether to block for completion of the processor or not
** Caller can also poll for status using the isDone method of the future handle. (The status is crude, but it is better than having no status at all.)
* Cons:
** This assumes that having a single-thread executor in StreamProcessor and a task thread pool within the container does not have a significant performance impact.
** Exceptions thrown are only seen when the user blocks on <futurehandle>.get(), and the original exception is wrapped in an ExecutionException.
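
To make option 1 concrete, here is a minimal sketch using only java.util.concurrent. It is not the StreamProcessor code: SketchProcessor, awaitCompletion and the Runnable standing in for the container are hypothetical names chosen for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FutureStartSketch {

    /** Hypothetical stand-in for a processor that owns a single-thread executor. */
    static final class SketchProcessor {
        private final ExecutorService executor = Executors.newSingleThreadExecutor();
        private final Runnable container; // assumption: the container is modelled as a Runnable

        SketchProcessor(Runnable container) {
            this.container = container;
        }

        /** Asynchronous start: returns immediately with a handle to the running container. */
        Future<?> start() {
            Future<?> handle = executor.submit(container);
            executor.shutdown(); // accept no new work; the submitted container still runs
            return handle;
        }
    }

    /** Blocks on the handle; returns the container's original failure, or null on clean completion. */
    static Throwable awaitCompletion(Future<?> handle) {
        try {
            handle.get();
            return null;
        } catch (ExecutionException e) {
            return e.getCause(); // the container's exception arrives wrapped
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return e;
        }
    }

    public static void main(String[] args) {
        SketchProcessor processor = new SketchProcessor(() -> {
            throw new IllegalStateException("container failed");
        });
        Future<?> handle = processor.start();
        // The caller chooses when (or whether) to block; the error only surfaces here.
        Throwable failure = awaitCompletion(handle);
        System.out.println("isDone=" + handle.isDone() + ", failure=" + failure);
    }
}
```

Note that isDone() only says the container has finished, not whether it failed; the caller has to call get() and unwrap the cause to learn that.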

2. Run the container in the same thread as the processor. This makes start() a blocking call, and it is left up to the caller to manage the thread execution.
* Pros:
** Simple solution that makes error handling more straightforward: exceptions thrown are not wrapped under "ExecutionException" or "InterruptedException".
** Since caller manages the thread, the caller knows the status of the processor
* Cons:
** We now have blocking API calls. This is not necessarily a drawback, since it is not a public API (at least not yet).
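
A comparable sketch of option 2, again with hypothetical names (SketchProcessor, runOnCallerManagedThread) and a Runnable standing in for the container. It is only meant to show that a blocking start() surfaces the original exception on whichever thread the caller chose to run it on.

```java
import java.util.concurrent.atomic.AtomicReference;

class BlockingStartSketch {

    /** Hypothetical processor whose start() runs the container on the calling thread. */
    static final class SketchProcessor {
        private final Runnable container; // assumption: the container is modelled as a Runnable

        SketchProcessor(Runnable container) {
            this.container = container;
        }

        /** Blocks until the container returns or throws; nothing is wrapped. */
        void start() {
            container.run();
        }
    }

    /** The caller owns the thread, so it sees the failure directly and knows when the processor ended. */
    static Throwable runOnCallerManagedThread(SketchProcessor processor) {
        AtomicReference<Throwable> failure = new AtomicReference<>();
        Thread thread = new Thread(() -> {
            try {
                processor.start();
            } catch (RuntimeException e) {
                failure.set(e); // original exception, no ExecutionException wrapper
            }
        }, "caller-managed-processor");
        thread.start();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return failure.get();
    }

    public static void main(String[] args) {
        SketchProcessor processor = new SketchProcessor(() -> {
            throw new IllegalStateException("container failed");
        });
        System.out.println("failure=" + runOnCallerManagedThread(processor));
    }
}
```

The trade-off is that any asynchrony now lives in caller code: a caller that does not want to block has to create and supervise the thread itself, as runOnCallerManagedThread does here.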

Adding callbacks or listeners (like onStart/onError/onShutdown) results in callbacks being handled in a different thread than the caller's thread. Not sure if that is in any way more useful than the above models. 
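
For comparison, a rough sketch of the listener model. ProcessorListener and startAsync are hypothetical; only the callback names onStart/onError/onShutdown come from the paragraph above. It illustrates that the callbacks run on the processor's thread, not the caller's.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

class ListenerSketch {

    /** Hypothetical listener interface; not an existing Samza API. */
    interface ProcessorListener {
        void onStart();
        void onError(Throwable t);
        void onShutdown();
    }

    /** Runs the container on a processor-owned thread and reports through the listener. */
    static void startAsync(Runnable container, ProcessorListener listener) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.execute(() -> {
            listener.onStart();
            try {
                container.run();
            } catch (Throwable t) {
                listener.onError(t);
            } finally {
                listener.onShutdown();
            }
        });
        executor.shutdown(); // accept no new work; the submitted task still runs
    }

    /** Returns the name of the thread that delivered onError, or null if no error was reported. */
    static String errorCallbackThread(Runnable container) {
        AtomicReference<String> threadName = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);
        startAsync(container, new ProcessorListener() {
            @Override public void onStart() { }
            @Override public void onError(Throwable t) {
                threadName.set(Thread.currentThread().getName());
            }
            @Override public void onShutdown() {
                done.countDown();
            }
        });
        try {
            done.await(); // the caller still needs its own signalling to observe the outcome
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return threadName.get();
    }

    public static void main(String[] args) {
        String callbackThread = errorCallbackThread(() -> {
            throw new IllegalStateException("container failed");
        });
        System.out.println("caller=" + Thread.currentThread().getName()
                + ", onError ran on=" + callbackThread);
    }
}
```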

[~nickpan47] [~xinyu] [~boryas] Any thoughts on these options? Or do you have other alternatives?


> User thread does not see errors from the processor thread when using the StreamProcessor API
> --------------------------------------------------------------------------------------------
>
>                 Key: SAMZA-1084
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1084
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Navina Ramesh
>            Assignee: Navina Ramesh
>             Fix For: 0.13.0
>
>
> The current user model for the StreamProcessor API allows the user to start (asynchronously) and stop the processor. awaitStart allows the user to wait until the processor actually initializes and starts processing messages. There are certain limitations to this API:
> 1. In case the processor fails during processing or prior to processing, the error is not propagated to the user context. It is very hard to troubleshoot and take action on error/shutdown.
> 2. There is also no way for the user to continuously check the status of the processor.
> 3. (More of an implementation detail than an API issue) Another issue is that we are using an Executor to run the container in a separate thread. The container uses another Executor to run the tasks. We need to understand the performance impact of using 2 levels of Executors in a single StreamProcessor instance.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)