Posted to dev@flume.apache.org by "Ashish Paliwal (JIRA)" <ji...@apache.org> on 2014/11/05 10:50:34 UTC

[jira] [Resolved] (FLUME-559) Add compression and batching features to rpcsinks and rpcsources.

     [ https://issues.apache.org/jira/browse/FLUME-559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashish Paliwal resolved FLUME-559.
----------------------------------
       Resolution: Won't Fix
    Fix Version/s: v0.9.5

Won't fix. The 0.x branch is not maintained anymore.

> Add compression and batching features to rpcsinks and rpcsources.
> -----------------------------------------------------------------
>
>                 Key: FLUME-559
>                 URL: https://issues.apache.org/jira/browse/FLUME-559
>             Project: Flume
>          Issue Type: New Feature
>    Affects Versions: v0.9.4
>            Reporter: Jonathan Hsieh
>             Fix For: v0.9.5
>
>
> Currently batching and compression options can be specified as data flow elements (decorators) but there are subtle issues that make them difficult to use effectively, especially in the e2e case.  
> The proposal here is to add compression and batching features to the rpc sinks.  This will likely require the addition of a "flush" or "sync" call to the sink/decorator interface.  However, this will greatly simplify the use of these optimizations from a user perspective.
> Here are some examples:
> This is ok:
> {code}
> batch(100) gzip rpcSink("xxx",1234)
> {code}
> In the new implementation it would be something like
> {code}
> rpcSink("xxx",1234, compression="gzip", batch="count(100)")
> {code}
> Ideally the rpcSources will be able to just accept compressed or batched data.
> Here's an example of things that seem inconsistent and take too long to explain (and thus are too complicated).
> Today, this should work, essentially as expected:
> {code}
> agent : source | batch(100) gzip agentBESink("collector");
> collector : collectorSource | gunzip unbatch collectorSink("XXX");
> {code}
> This works, but may not work the way one would expect (events in the batching buffer can get lost because the WAL happens after batching/gzipping).
> {code}
> agent : source | batch(100) gzip agentE2ESink("collector");
> collector : collectorSource | gunzip unbatch collectorSink("XXX");
> {code}
> This one will not work (compressed events have zero-size bodies; acks operate on bodies, so the acks are worthless).
> {code}
> agent : source | batch(100) gzip agentE2ESink("collector");
> collector : collectorSource | collector(30000) { gunzip unbatch escapedCustomDfs("XXX","yyy") };
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)