Posted to commits@samza.apache.org by "Tao Feng (JIRA)" <ji...@apache.org> on 2015/12/18 23:53:46 UTC

[jira] [Updated] (SAMZA-845) Reduce memory footprint for SystemConsumer

     [ https://issues.apache.org/jira/browse/SAMZA-845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Feng updated SAMZA-845:
---------------------------
    Description: Currently, KafkaSystemConsumer prefetches 50,000 messages by default, a behavior introduced in SAMZA-203. According to Chris's comments in SAMZA-245 and SAMZA-775, each message may be buffered twice: once in KafkaSystemConsumer (bufferedMessages) and once in SystemConsumers (unprocessedMessagesBySSP). If each message is about 10 KB, those comments suggest we need roughly 10 KB * 50,000 * 2 (about 1 GB) of buffer memory. The reason for the double buffering is that BrokerProxy actively fetches messages whenever the total buffered count falls below the fetchThreshold (50,000), to avoid message-latency problems, and inserts them into KafkaSystemConsumer's bufferedMessages queue. Whenever message choosing happens (i.e., SystemConsumers.choose is called), SystemConsumers pulls messages from bufferedMessages into its own buffer, unprocessedMessagesBySSP, and later hands them over to the StreamTask for processing. It would be good to reduce the memory footprint here if this is indeed the case. I would like to hear from others on whether this is a real issue, and would be happy to take a stab at it if so.  (was: Currently KafkaSystemConsumer will by default prefetch 50000 messages which is introduced in SAMZA-203. And according to Chris's comment in SAMZA-245 and SAMZA-775, each message potentially will be buffered twice, one in KafkaSystemConsumer(bufferedMessages) and one in SystemConsumers(unprocessedMessagesBySSP). If each message is around 10k byte, we need to have 10k*50k*2 memory to buffer according to the comment. The reason we need to buffer twice is that BrokerProxy will actively fetch message if the total number of messages  below the fetchThreshold(50000) to avoid potentially message latency performance issue and insert into KafkaSystemConsumer's bufferedMessages queue. 
Whenever message choosing is happen(SystemConsumers.choose is called), it will fetch messages from the bufferedMessages and insert into  its own buffer "unprocessedMessagesBySSP" and later handle over the streamTask to process. It will be good if we could reduce the memory usage here.  I would like to hear from others about whether this is an issue and would like myself to tab on this if it is. )
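The worst-case estimate in the description can be sketched as a back-of-the-envelope calculation. The numbers below are the assumptions from the issue text (50,000 prefetched messages, ~10 KB each, two buffer copies), not measurements of an actual Samza deployment:

```java
// Rough worst-case memory estimate for the double buffering described above.
// fetchThreshold, avgMessageBytes, and bufferCopies are assumptions taken
// from the issue text, not measured values.
public class BufferFootprintEstimate {
    public static long worstCaseBytes(long fetchThreshold,
                                      long avgMessageBytes,
                                      int bufferCopies) {
        return fetchThreshold * avgMessageBytes * bufferCopies;
    }

    public static void main(String[] args) {
        // 50,000 messages * 10,000 bytes * 2 copies = 1,000,000,000 bytes (~1 GB)
        System.out.println(worstCaseBytes(50_000L, 10_000L, 2));
    }
}
```

In practice the real footprint also depends on per-message object overhead and on how full each buffer actually is, so this is an upper bound under the stated assumptions.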

> Reduce memory footprint for SystemConsumer
> ------------------------------------------
>
>                 Key: SAMZA-845
>                 URL: https://issues.apache.org/jira/browse/SAMZA-845
>             Project: Samza
>          Issue Type: Improvement
>            Reporter: Tao Feng
>
> Currently, KafkaSystemConsumer prefetches 50,000 messages by default, a behavior introduced in SAMZA-203. According to Chris's comments in SAMZA-245 and SAMZA-775, each message may be buffered twice: once in KafkaSystemConsumer (bufferedMessages) and once in SystemConsumers (unprocessedMessagesBySSP). If each message is about 10 KB, those comments suggest we need roughly 10 KB * 50,000 * 2 (about 1 GB) of buffer memory. The reason for the double buffering is that BrokerProxy actively fetches messages whenever the total buffered count falls below the fetchThreshold (50,000), to avoid message-latency problems, and inserts them into KafkaSystemConsumer's bufferedMessages queue. Whenever message choosing happens (i.e., SystemConsumers.choose is called), SystemConsumers pulls messages from bufferedMessages into its own buffer, unprocessedMessagesBySSP, and later hands them over to the StreamTask for processing. It would be good to reduce the memory footprint here if this is indeed the case. I would like to hear from others on whether this is a real issue, and would be happy to take a stab at it if so.
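One possible direction for the reduction discussed above is to have the fetcher and the chooser share a single bounded buffer instead of copying messages between two queues. The sketch below is purely illustrative; the class and method names are hypothetical and are not Samza's actual API:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: one shared bounded buffer per stream partition,
// replacing the two-queue hand-off (bufferedMessages ->
// unprocessedMessagesBySSP) described in the issue. Names are illustrative.
public class SharedMessageBuffer {
    private final BlockingQueue<byte[]> messages;

    public SharedMessageBuffer(int capacity) {
        // The capacity plays the role of the fetch threshold: a full buffer
        // blocks the fetcher, so no second copy of the backlog is needed.
        this.messages = new ArrayBlockingQueue<>(capacity);
    }

    // Called by the fetching side (BrokerProxy's role in the issue);
    // blocks when the buffer is at capacity.
    public void put(byte[] message) throws InterruptedException {
        messages.put(message);
    }

    // Called by the choosing side (SystemConsumers.choose's role);
    // returns null when no message is buffered.
    public byte[] poll() {
        return messages.poll();
    }
}
```

With a shared structure like this, each message is held once between fetch and processing, so the worst-case footprint drops from two copies of the backlog to one, at the cost of tighter coupling between the fetch loop and the chooser.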



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)