Posted to commits@samza.apache.org by "Chris Riccomini (JIRA)" <ji...@apache.org> on 2014/03/24 22:00:56 UTC

[jira] [Updated] (SAMZA-203) Bad performance in BrokerProxy when restoring changelogs

     [ https://issues.apache.org/jira/browse/SAMZA-203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Riccomini updated SAMZA-203:
----------------------------------

    Attachment: SAMZA-203.0.patch

Attaching patch to fix this. RB at:

https://reviews.apache.org/r/19593/

Changes made:

1. Change fetchThreshold default to 50000.
2. Change fetchThreshold to be divided up per-partition.
3. Fix broker proxy thread names to show host/port again.
4. Change thread sleep in broker proxy to be 100ms not 1000ms.
5. Catch interrupt exceptions in BrokerProxy so we don't pollute STDERR.
6. Fix samza.fetch.threshold config name to work properly.
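
To make change 2 concrete, here is a rough sketch of splitting one fetch threshold evenly across partitions. It is not the code from the patch. The class and method names are hypothetical, and the rule "fetch a partition only while its buffered message count is below its share" is an assumption about how the threshold is applied.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: names are illustrative and are not Samza's actual API.
class FetchThresholdSketch {
    // Split a consumer-wide threshold evenly across partitions, never below 1.
    static int perPartitionThreshold(int totalThreshold, int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        return Math.max(1, totalThreshold / partitionCount);
    }

    // Assumption: a partition is fetched only while its buffered message
    // count is below its per-partition share of the threshold.
    static List<String> partitionsToFetch(Map<String, Integer> queueSizes, int totalThreshold) {
        List<String> result = new ArrayList<>();
        if (queueSizes.isEmpty()) {
            return result; // nothing registered, nothing to fetch
        }
        int limit = perPartitionThreshold(totalThreshold, queueSizes.size());
        for (Map.Entry<String, Integer> entry : queueSizes.entrySet()) {
            if (entry.getValue() < limit) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> queueSizes = new LinkedHashMap<>();
        queueSizes.put("changelog-0", 30000); // above its share of 25000
        queueSizes.put("changelog-1", 100);   // below its share of 25000
        System.out.println(partitionsToFetch(queueSizes, 50000));
    }
}
```

With the new default of 50000 and two partitions, each partition gets a share of 25000, so a single busy partition can no longer fill the whole budget while the other one starves.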

No tests yet. Let me know if the changes seem reasonable so far.

> Bad performance in BrokerProxy when restoring changelogs
> --------------------------------------------------------
>
>                 Key: SAMZA-203
>                 URL: https://issues.apache.org/jira/browse/SAMZA-203
>             Project: Samza
>          Issue Type: Bug
>          Components: kv
>    Affects Versions: 0.6.0
>            Reporter: Chris Riccomini
>             Fix For: 0.7.0
>
>         Attachments: SAMZA-203.0.patch
>
>
> As part of SAMZA-126, we introduced a Thread.sleep call in BrokerProxy's fetchMessages method. The goal was to skip fetch requests on SimpleConsumer when the topicAndPartitionsToFetch variable was empty. Since we had no topic/partitions to fetch, we slowed the thread down by calling Thread.sleep(sleepMSWhileNoTopicPartitions), which defaults to 1000ms.
> We now see that we are only getting about 1mb/s when restoring changelogs. This is very slow. Upon investigation, it appears that the BrokerProxy thread is sleeping 90% of the time during restore, and the main SamzaContainer thread is polling for more messages about 60% of the time.
> The reason for the poor restore performance is that the BrokerProxy sleeps for 1 second every time the message queue for the restore topic is not empty. Effectively, the proxy starts throttling the reads. If I comment out the Thread.sleep line in the BrokerProxy, I get about 64mb/s network usage on my loopback (one broker running locally), 10mb/s disk read, and 70mb/s disk write on my MacBook Air SSD--the write appears to be the bottleneck (since we're writing all the values to the LevelDB store). This is much much faster than before.
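
For illustration, the idle back-off described above, combined with the shorter 100ms sleep and the quiet interrupt handling listed in the patch notes, might look roughly like the sketch below. This is not the BrokerProxy code. The class name, the IntSupplier hook, and the thread name are hypothetical, and the actual fetch request is left out.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.IntSupplier;

// Hypothetical sketch of a fetch loop that backs off when there is nothing to fetch.
class IdleFetchLoopSketch implements Runnable {
    private final IntSupplier partitionsNeedingFetch; // hypothetical hook, not a Samza API
    private final long idleSleepMs;
    final AtomicBoolean exitedCleanly = new AtomicBoolean(false);

    IdleFetchLoopSketch(IntSupplier partitionsNeedingFetch, long idleSleepMs) {
        this.partitionsNeedingFetch = partitionsNeedingFetch;
        this.idleSleepMs = idleSleepMs;
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                if (partitionsNeedingFetch.getAsInt() == 0) {
                    // Nothing to fetch: back off briefly instead of spinning.
                    Thread.sleep(idleSleepMs);
                } else {
                    // A real proxy would issue a fetch request here (omitted).
                    Thread.yield();
                }
            }
        } catch (InterruptedException e) {
            // Interrupts are expected on shutdown: restore the flag and
            // exit without printing a stack trace to STDERR.
            Thread.currentThread().interrupt();
        }
        exitedCleanly.set(true);
    }

    public static void main(String[] args) throws InterruptedException {
        IdleFetchLoopSketch loop = new IdleFetchLoopSketch(() -> 0, 100L);
        // Thread name carrying host/port; the exact format is an assumption.
        Thread thread = new Thread(loop, "BrokerProxy-sketch-localhost:9092");
        thread.start();
        thread.interrupt();
        thread.join();
        System.out.println("exited cleanly: " + loop.exitedCleanly.get());
    }
}
```

With a 100ms idle sleep the loop re-checks the queues up to ten times per second when idle, rather than once per second with the old 1000ms default.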


