Posted to commits@samza.apache.org by "Prateek Maheshwari (JIRA)" <ji...@apache.org> on 2019/01/25 17:56:00 UTC

[jira] [Commented] (SAMZA-2087) Use separate thread pools for AsyncStreamTaskAdapter and AsyncRunLoop

    [ https://issues.apache.org/jira/browse/SAMZA-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16752513#comment-16752513 ] 

Prateek Maheshwari commented on SAMZA-2087:
-------------------------------------------

[~bharathkk] If I understand correctly, your proposal is to use a separate executor with 'job.container.thread.pool.size' threads for task processing, which potentially increases the number of threads available to task.process(), up to the full 'job.container.thread.pool.size'. Two questions:
1. Isn't this approximately (excluding commit/window) the same number of threads they currently have for task.process()?
2. This number may still be configured to be less than task.max.concurrency, in which case users will have the same issue. How does separating the thread pools help here? Would a better approach be to call out this dependency explicitly in the configuration reference?
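
To make question 2 concrete, here is a minimal sketch of the interaction, not Samza code. It assumes a fixed-size executor standing in for the 'job.container.thread.pool.size' pool and a semaphore standing in for the task.max.concurrency limit; the class name PoolBoundDemo and the method peakInFlight are hypothetical and exist only for illustration. It records the peak number of messages being processed at once, which cannot exceed the smaller of the two settings, whichever pool the work runs on.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolBoundDemo {

    // Hypothetical helper (not a Samza API): runs `messages` blocking
    // "process" calls and returns the peak number running at the same time.
    static int peakInFlight(int poolSize, int maxConcurrency, int messages)
            throws InterruptedException {
        // Stand-in for the pool sized by job.container.thread.pool.size.
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        // Stand-in for the task.max.concurrency limit on in-flight messages.
        Semaphore permits = new Semaphore(maxConcurrency);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(messages);

        for (int i = 0; i < messages; i++) {
            permits.acquire(); // the dispatcher waits for a free slot
            pool.submit(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(20); // simulated blocking process() call
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    running.decrementAndGet();
                    permits.release();
                    done.countDown();
                }
            });
        }

        done.await();
        pool.shutdown();
        return peak.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Pool smaller than the concurrency limit: the pool is the bound.
        System.out.println("pool=2, concurrency=8 -> peak " + peakInFlight(2, 8, 16));
        // Concurrency limit smaller than the pool: the limit is the bound.
        System.out.println("pool=8, concurrency=2 -> peak " + peakInFlight(8, 2, 16));
    }
}
```

In both runs the reported peak is at most 2. In this sketch a dedicated pool still caps the adapted task at the pool size whenever that size is below task.max.concurrency, which is the dependency I'd suggest documenting.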

> Use separate thread pools for AsyncStreamTaskAdapter and AsyncRunLoop
> ---------------------------------------------------------------------
>
>                 Key: SAMZA-2087
>                 URL: https://issues.apache.org/jira/browse/SAMZA-2087
>             Project: Samza
>          Issue Type: Bug
>          Components: container
>            Reporter: Bharath Kumarasubramanian
>            Assignee: Bharath Kumarasubramanian
>            Priority: Major
>
> Currently, AsyncStreamTaskAdapter and AsyncRunLoop share a thread pool whose size is governed by {{job.container.thread.pool.size}}. This introduces a disparity in the semantics of {{task.max.concurrency}} between an async stream task and an adapted async stream task.
> {{task.max.concurrency}} governs the parallelism within a task. For applications that directly implement AsyncStreamTask, it corresponds to the maximum number of in-flight messages for a task. However, for applications that are adapted to AsyncStreamTask, the maximum number of in-flight messages is bounded by {{job.container.thread.pool.size}}. In fact, all the tasks within a container share the thread pool, which results in even less parallelism within a task, even when applications configure {{task.max.concurrency}} to be much greater than {{job.container.thread.pool.size}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)