Posted to commits@samza.apache.org by "Chris Riccomini (JIRA)" <ji...@apache.org> on 2013/09/13 02:08:51 UTC

[jira] [Updated] (SAMZA-2) Fine-grain control over stream consumption

     [ https://issues.apache.org/jira/browse/SAMZA-2?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Riccomini updated SAMZA-2:
--------------------------------

    Description: 
Currently, Samza exposes configuration of the form "streams.%s.consumer.max.bytes.per.sec" for throttling the number of bytes a task reads from a stream. This is a feature request for programmatic, fine-grained control over stream consumption. The use case is a Samza task that consumes multiple streams, where some streams come from live systems that have stricter SLA requirements and must always be prioritized over streams that come from batch systems. The above configuration is not an ideal way to express this kind of stream prioritization, because configuring the "batch" streams with a low consumption rate decreases the overall throughput of the system when there is no data in the "live" streams. Furthermore, we'll want to throttle each "batch" stream based on external signals that can change over time. Because these signals are dynamic, we would like a programmatic interface that can change the prioritization as the signals change.

Design proposal:

https://wiki.apache.org/samza/Pluggable%20MessageChooser

  was:
Currently, Samza exposes configuration of the form "streams.%s.consumer.max.bytes.per.sec" for throttling the number of bytes a task reads from a stream. This is a feature request for programmatic, fine-grained control over stream consumption. The use case is a Samza task that consumes multiple streams, where some streams come from live systems that have stricter SLA requirements and must always be prioritized over streams that come from batch systems. The above configuration is not an ideal way to express this kind of stream prioritization, because configuring the "batch" streams with a low consumption rate decreases the overall throughput of the system when there is no data in the "live" streams. Furthermore, we'll want to throttle each "batch" stream based on external signals that can change over time. Because these signals are dynamic, we would like a programmatic interface that can change the prioritization as the signals change.


    
> Fine-grain control over stream consumption
> ------------------------------------------
>
>                 Key: SAMZA-2
>                 URL: https://issues.apache.org/jira/browse/SAMZA-2
>             Project: Samza
>          Issue Type: Bug
>          Components: container
>    Affects Versions: 0.6.0
>            Reporter: Chris Riccomini
>            Assignee: Chris Riccomini
>             Fix For: 0.7.0
>
>         Attachments: SAMZA-2.0.patch, SAMZA-2.1.patch
>
>
> Currently, Samza exposes configuration of the form "streams.%s.consumer.max.bytes.per.sec" for throttling the number of bytes a task reads from a stream. This is a feature request for programmatic, fine-grained control over stream consumption. The use case is a Samza task that consumes multiple streams, where some streams come from live systems that have stricter SLA requirements and must always be prioritized over streams that come from batch systems. The above configuration is not an ideal way to express this kind of stream prioritization, because configuring the "batch" streams with a low consumption rate decreases the overall throughput of the system when there is no data in the "live" streams. Furthermore, we'll want to throttle each "batch" stream based on external signals that can change over time. Because these signals are dynamic, we would like a programmatic interface that can change the prioritization as the signals change.
> Design proposal:
> https://wiki.apache.org/samza/Pluggable%20MessageChooser
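
To make the request concrete, here is a minimal sketch of the kind of programmatic interface described above. It is an illustration only: the class name PriorityChooser and the methods setPriority, update, and choose are hypothetical, and are not taken from the linked design proposal or from Samza's actual API. The idea is that the "live" stream always wins when it has data, the "batch" stream is read at full speed when "live" is idle, and priorities can be changed at runtime in response to an external signal.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; names and signatures are assumptions, not Samza's API.
final class PriorityChooser {
    // Buffered messages per stream, in arrival order.
    private final Map<String, Deque<String>> pending = new HashMap<>();
    // Higher value means higher priority; streams without an entry default to 0.
    private final Map<String, Integer> priorities = new HashMap<>();

    // Change a stream's priority at runtime, e.g. when an external signal changes.
    public synchronized void setPriority(String stream, int priority) {
        priorities.put(stream, priority);
    }

    // Buffer a newly arrived message for the given stream.
    public synchronized void update(String stream, String message) {
        pending.computeIfAbsent(stream, s -> new ArrayDeque<>()).addLast(message);
    }

    // Return the oldest message from the highest-priority stream that has
    // pending data, or null if nothing is buffered. Ties between streams of
    // equal priority are broken arbitrarily in this sketch.
    public synchronized String choose() {
        String best = null;
        int bestPriority = Integer.MIN_VALUE;
        for (Map.Entry<String, Deque<String>> e : pending.entrySet()) {
            if (e.getValue().isEmpty()) {
                continue;
            }
            int p = priorities.getOrDefault(e.getKey(), 0);
            if (best == null || p > bestPriority) {
                best = e.getKey();
                bestPriority = p;
            }
        }
        return best == null ? null : pending.get(best).pollFirst();
    }
}
```

Unlike a fixed bytes-per-second limit, this approach never idles the task: a low-priority stream is only held back while a higher-priority stream actually has messages waiting.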
