Posted to commits@samza.apache.org by "Jake Maes (JIRA)" <ji...@apache.org> on 2017/06/06 17:24:18 UTC

[jira] [Updated] (SAMZA-1323) Reduce Timer memory usage

     [ https://issues.apache.org/jira/browse/SAMZA-1323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jake Maes updated SAMZA-1323:
-----------------------------
    Description: 
Unlike Counters and Gauges, Timers (SlidingTimeWindowReservoir) retain individual data points over a sliding window. 

Most timers are instantiated on the order of the number of tasks; with the high-level API, there is one per operator per task. So the number of data points retained for each timer is:

NumOperatorsPerTask * NumTasks * TimerCollisionBuffer * TimerWindowSize * %WindowFill. 

Values for a typical application could be:
4 * 128 * 1 * 300000 * 1 = 153.6M (roughly 154M) time-value pairs if the window is always full (e.g. because we process a message every ms)
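
The estimate above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the function and parameter names are hypothetical, chosen to mirror the terms of the formula, and are not part of Samza.

```python
def estimate_timer_data_points(num_operators_per_task, num_tasks,
                               collision_buffer, window_size_ms,
                               window_fill):
    """Estimate the number of time-value pairs retained by the timers.

    window_fill is the fraction of the window that is populated (0.0 to 1.0).
    """
    return int(num_operators_per_task * num_tasks * collision_buffer
               * window_size_ms * window_fill)


# Values from the example above: 4 operators per task, 128 tasks,
# a collision buffer of 1, and a 300000 ms window that is always full.
typical = estimate_timer_data_points(4, 128, 1, 300_000, 1.0)
print(f"{typical:,} time-value pairs")  # 153,600,000 time-value pairs
```

Lowering window_fill or window_size_ms scales the result linearly, which is why reducing the window size is the most direct lever.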

The timers are useful, but they're resource-hungry.

SAMZA-834 reduced the memory usage by changing the default collision buffer from 256 

Users can also disable the timers using the config:
metrics.timer.enabled=false

But the goal of this ticket is to explore ways to further reduce the memory overhead of timers so users don't have to turn them off. 

A naive way is to reduce the window size, but ideally a more intelligent or coarser-grained (bucketed) sliding window implementation could be employed. 
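
To make the bucketed idea concrete, here is a minimal sketch in Python. It is an illustration only, not Samza's implementation or a proposed patch: the class name, its methods, and the choice of aggregates (count, sum, min, max) are all assumptions. It also assumes timestamps arrive in non-decreasing order.

```python
from collections import deque


class BucketedSlidingWindow:
    """Hypothetical reservoir that keeps per-bucket aggregates
    instead of individual time-value pairs."""

    def __init__(self, window_ms, bucket_ms):
        if bucket_ms <= 0 or window_ms < bucket_ms:
            raise ValueError("need 0 < bucket_ms <= window_ms")
        self.window_ms = window_ms
        self.bucket_ms = bucket_ms
        # Each entry: [bucket_start_ms, count, total, minimum, maximum]
        self.buckets = deque()

    def _expire(self, now_ms):
        # Drop buckets that lie entirely outside the sliding window.
        cutoff = now_ms - self.window_ms
        while self.buckets and self.buckets[0][0] + self.bucket_ms <= cutoff:
            self.buckets.popleft()

    def update(self, value, now_ms):
        # Assumes now_ms never decreases between calls.
        self._expire(now_ms)
        start = now_ms - (now_ms % self.bucket_ms)
        if self.buckets and self.buckets[-1][0] == start:
            bucket = self.buckets[-1]
            bucket[1] += 1
            bucket[2] += value
            bucket[3] = min(bucket[3], value)
            bucket[4] = max(bucket[4], value)
        else:
            self.buckets.append([start, 1, value, value, value])

    def snapshot(self, now_ms):
        self._expire(now_ms)
        count = sum(b[1] for b in self.buckets)
        if count == 0:
            return {"count": 0, "mean": None, "min": None, "max": None}
        total = sum(b[2] for b in self.buckets)
        return {
            "count": count,
            "mean": total / count,
            "min": min(b[3] for b in self.buckets),
            "max": max(b[4] for b in self.buckets),
        }


# One update per ms for 10 seconds, using 1-second buckets.
window = BucketedSlidingWindow(window_ms=300_000, bucket_ms=1_000)
for t_ms in range(10_000):
    window.update(value=t_ms % 5, now_ms=t_ms)
print(len(window.buckets))  # 10 buckets stand in for 10,000 data points
print(window.snapshot(now_ms=9_999))
```

With these assumed settings, memory per timer is bounded by roughly window_ms / bucket_ms entries regardless of message rate. The trade-off is that exact percentiles can no longer be computed from the aggregates alone; a real implementation would likely need a small histogram per bucket if percentiles matter.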


  was:
Unlike Counters and Gauges, Timers (SlidingTimeWindowReservoir) retain individual data points over a sliding window. 

Most timers are instantiated on the order of the number of tasks. With the high level API it's per operator per task. So the number of data points used for each timer is:

NumOperatorsPerTask * NumTasks * TimerCollisionBuffer * TimerWindowSize * %WindowFill. 

Values for a typical application could be:
4 * 128 * 1 * 300000 * 1 = 154M time-value pairs if the window is always filled (e.g. because we process a message every ms)

The timers are useful, but they're resource hungry.

SAMZA-834 reduced the memory usage by changing the default collision buffer from 256 

Users can also disable the timers using the config:
metrics.timer.enabled=false

But the goal of this ticket is to explore ways to further reduce the memory overhead of timers so users don't have to turn them off. 



> Reduce Timer memory usage
> -------------------------
>
>                 Key: SAMZA-1323
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1323
>             Project: Samza
>          Issue Type: Bug
>            Reporter: Jake Maes
>
> Unlike Counters and Gauges, Timers (SlidingTimeWindowReservoir) retain individual data points over a sliding window. 
> Most timers are instantiated on the order of the number of tasks; with the high-level API, there is one per operator per task. So the number of data points retained for each timer is:
> NumOperatorsPerTask * NumTasks * TimerCollisionBuffer * TimerWindowSize * %WindowFill. 
> Values for a typical application could be:
> 4 * 128 * 1 * 300000 * 1 = 153.6M (roughly 154M) time-value pairs if the window is always full (e.g. because we process a message every ms)
> The timers are useful, but they're resource-hungry.
> SAMZA-834 reduced the memory usage by changing the default collision buffer from 256 
> Users can also disable the timers using the config:
> metrics.timer.enabled=false
> But the goal of this ticket is to explore ways to further reduce the memory overhead of timers so users don't have to turn them off. 
> A naive way is to reduce the window size, but ideally a more intelligent or coarser-grained (bucketed) sliding window implementation could be employed. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)