Posted to commits@samza.apache.org by "Navina Ramesh (JIRA)" <ji...@apache.org> on 2015/11/17 23:21:11 UTC
[jira] [Updated] (SAMZA-459) Explicit flush for individual output streams
[ https://issues.apache.org/jira/browse/SAMZA-459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Navina Ramesh updated SAMZA-459:
--------------------------------
Fix Version/s: (was: 0.10.0)
> Explicit flush for individual output streams
> --------------------------------------------
>
> Key: SAMZA-459
> URL: https://issues.apache.org/jira/browse/SAMZA-459
> Project: Samza
> Issue Type: Improvement
> Components: container
> Affects Versions: 0.9.0
> Reporter: Ben Kirwin
> Priority: Minor
>
> From the mailing list:
> http://mail-archives.apache.org/mod_mbox/incubator-samza-dev/201411.mbox/%3CCACuX-D8-CS7867ob47fqytCAdvGURc4owv82Rhg2oEJYmr8hpg%40mail.gmail.com%3E
> At the moment, the only way to trigger a flush of the output streams is to call TaskCoordinator.commit, which also flushes the state and saves the checkpoints. There are a few cases where more granularity would be useful: writing out a single stream can be much faster than doing a full commit, and if a user cares about the order in which messages are published, they can disable the autocommit and trigger flushes manually.
> I'd anticipate this to look something like TaskCoordinator.flush(systemStream). It looks like the TaskCoordinator normally only queues up work, instead of doing it synchronously -- if that's the case, it should be enough to buffer up all the requested flushes, then perform them in order when the moment comes.
> Note: you could get *almost* the same effect by switching to a synchronous system and letting the user send a batch of messages all at once, much as the underlying Kafka client does. This wouldn't let you flush a changelog stream, though.
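
The buffering idea in the proposal above (queue the requested flushes, then perform them in order when the moment comes) can be sketched roughly as follows. This is only an illustration, not Samza code: the class DeferredFlushCoordinator, its method runPendingFlushes, and the use of plain strings as stream names are all hypothetical, and the real TaskCoordinator has no flush(systemStream) method as of this issue.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Consumer;

/**
 * Hypothetical sketch of deferred per-stream flushing. Nothing here is part
 * of Samza's API; stream names are plain strings for illustration only.
 */
final class DeferredFlushCoordinator {
    // Flush requests in the order the task issued them.
    private final Queue<String> pending = new ArrayDeque<>();
    // Assumed callback that actually flushes one stream (e.g. a producer flush).
    private final Consumer<String> flusher;

    DeferredFlushCoordinator(Consumer<String> flusher) {
        this.flusher = flusher;
    }

    /** Called by the task: only records the request, does no I/O. */
    void flush(String systemStream) {
        pending.add(systemStream);
    }

    /** Called by the container later: performs the queued flushes in request order. */
    int runPendingFlushes() {
        int count = 0;
        String stream;
        while ((stream = pending.poll()) != null) {
            flusher.accept(stream);
            count++;
        }
        return count;
    }
}
```

In this sketch the task's call returns immediately, and the container would drain the queue at whatever point it already handles coordinator requests, so the flushes run in the order they were requested.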
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)