Posted to commits@samza.apache.org by "Shanthoosh Venkataraman (JIRA)" <ji...@apache.org> on 2019/07/31 00:27:00 UTC

[jira] [Resolved] (SAMZA-2284) Remove redundant stream metadata API invocations in SamzaContainer startup sequence.

     [ https://issues.apache.org/jira/browse/SAMZA-2284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shanthoosh Venkataraman resolved SAMZA-2284.
--------------------------------------------
    Resolution: Fixed

> Remove redundant stream metadata API invocations in SamzaContainer startup sequence.
> ------------------------------------------------------------------------------------
>
>                 Key: SAMZA-2284
>                 URL: https://issues.apache.org/jira/browse/SAMZA-2284
>             Project: Samza
>          Issue Type: Improvement
>            Reporter: Shanthoosh Venkataraman
>            Assignee: Shanthoosh Venkataraman
>            Priority: Major
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
>  
> The SamzaContainer startup sequence fetches the metadata of the same input streams multiple times. Fetching the metadata of a stream entails making a remote call to the underlying messaging broker and is very expensive. These redundant fetch-input-stream-metadata API invocations significantly delay the start of actual message processing by the Samza job.
> Impact:
> 1. For some Samza jobs at LinkedIn, we observed that this fetch-input-stream-metadata loop took around 1.5 hours to complete.
> 2. The redundant fetch-input-stream-metadata remote API calls significantly increase the load on the underlying messaging broker.
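
As an illustration of the general idea only (this is not the change that resolved SAMZA-2284, and it does not use Samza's real APIs), one common way to avoid repeated remote metadata lookups is to memoize them per stream name. In the sketch below, the class CachingMetadataFetcher, the remoteFetch function, and the stream names are all hypothetical stand-ins; remoteFetch plays the role of the expensive broker call.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/**
 * Hypothetical helper (not part of Samza): remembers metadata per stream
 * so that each stream is fetched from the broker at most once.
 */
public class CachingMetadataFetcher<M> {
    // Stands in for the expensive remote call to the messaging broker.
    private final Function<String, M> remoteFetch;
    private final Map<String, M> cache = new ConcurrentHashMap<>();

    public CachingMetadataFetcher(Function<String, M> remoteFetch) {
        this.remoteFetch = remoteFetch;
    }

    /** Returns cached metadata, invoking the remote fetch only on the first request for a stream. */
    public M getMetadata(String streamName) {
        return cache.computeIfAbsent(streamName, remoteFetch);
    }

    public static void main(String[] args) {
        AtomicInteger remoteCalls = new AtomicInteger();
        CachingMetadataFetcher<String> fetcher = new CachingMetadataFetcher<>(stream -> {
            remoteCalls.incrementAndGet(); // count simulated broker round trips
            return "metadata-for-" + stream;
        });

        // Several startup components ask for the same two streams.
        for (int component = 0; component < 3; component++) {
            fetcher.getMetadata("page-views");
            fetcher.getMetadata("clicks");
        }

        System.out.println("remote calls: " + remoteCalls.get()); // prints "remote calls: 2"
    }
}
```

Here six lookups result in two simulated broker calls, one per distinct stream. A real container would also need to decide when cached metadata becomes stale, which this sketch does not address.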



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)