You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@samza.apache.org by "Tao Feng (JIRA)" <ji...@apache.org> on 2015/12/18 23:51:46 UTC

[jira] [Created] (SAMZA-845) Reduce memory footprint for SystemConsumer

Tao Feng created SAMZA-845:
------------------------------

             Summary: Reduce memory footprint for SystemConsumer
                 Key: SAMZA-845
                 URL: https://issues.apache.org/jira/browse/SAMZA-845
             Project: Samza
          Issue Type: Improvement
            Reporter: Tao Feng


Currently KafkaSystemConsumer will by default prefetch 50000 messages which is introduced in SAMZA-203. And according to Chris's comment in SAMZA-245 and SAMZA-775, each message potentially will be buffered twice, one in KafkaSystemConsumer(bufferedMessages) and one in SystemConsumers(unprocessedMessagesBySSP). If each message is around 10k byte, we need to have 10k*50k*2 memory to buffer according to the comment. The reason we need to buffer twice is that BrokerProxy will actively fetch message if the total number of messages  below the fetchThreshold(50000) to avoid potentially message latency performance issue and insert into KafkaSystemConsumer's bufferedMessages queue. Whenever message choosing is happen(SystemConsumers.choose is called), it will fetch messages from the bufferedMessages and insert into  its own buffer "unprocessedMessagesBySSP" and later handle over the streamTask to process. It will be good if we could reduce the memory usage here.  I would like to hear from others about whether this is an issue and would like myself to tab on this if it is. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)