Posted to user@flume.apache.org by Mingjie Lai <mj...@gmail.com> on 2011/11/03 21:28:42 UTC

Re: Logging AVRO-Binary directly?

Sorry for the late reply. Please see my responses inline.

> So you would log the avro data raw in the message body?

Yes. I assume you only use Avro to encode the log messages, right? You 
won't use it for RPC, right?

> I thought this would be a problem.
> I mean flume has to know how to handle the data, or am I wrong?

I would use Flume purely as a ``pipe'' that is only responsible for 
moving data. It doesn't have to understand the content of the data.
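
To illustrate what I mean by a pipe, here is a rough sketch. It is not 
Flume's API; `pipe` and the sink callback are hypothetical names, and the 
sample bodies are arbitrary bytes. The point is only that each body is 
forwarded unchanged, without being parsed.

```python
from typing import Callable, Iterable


def pipe(bodies: Iterable[bytes], sink: Callable[[bytes], None]) -> int:
    """Forward each event body unchanged; return how many were moved.

    Hypothetical helper, not part of Flume. The body is never decoded,
    so no schema is needed here.
    """
    count = 0
    for body in bodies:
        sink(body)  # opaque bytes in, same opaque bytes out
        count += 1
    return count


# Usage: the "sink" is just a list here; in practice it would be storage.
received = []
moved = pipe([b"\x02\x04hi", b"\x00\xff\x10"], received.append)
```

Whatever reads the data later is the only place that needs the schema.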

> Filtering would be nice too, but that's not a hard requirement at the
> beginning.

If you want to do filtering, it would be a different story, since a 
decorator has to understand the content of the data. But you still 
don't need an Avro source that knows the schema.

> As far as I can see avro is meant to encode single messages but rather
> files/streams.

AFAIK, it's not only for encoding single messages. It can be used to 
encode files, RPC, etc.

> So there seems to be no way to encode a single binary
> message into a string.

I don't quite understand. Flume and Avro both handle data in binary. You 
don't need to worry about strings, right?
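
As a rough illustration that a single record can stand alone as bytes, 
here is a hand-rolled sketch of Avro's binary encoding for one record. 
It assumes a made-up schema with two fields, `ts` (long) and `msg` 
(string), that both writer and reader know in advance. Real code would 
use the Avro library instead; the function names below are hypothetical.

```python
def encode_long(n: int) -> bytes:
    """Avro long: zigzag-encode, then emit as a variable-length integer."""
    z = (n << 1) ^ (n >> 63)  # zigzag maps signed 64-bit to unsigned
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def decode_long(buf: bytes, pos: int = 0):
    """Inverse of encode_long; returns (value, next position)."""
    z, shift = 0, 0
    while True:
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1), pos


def encode_record(ts: int, msg: str) -> bytes:
    """Record = fields concatenated in schema order; no header, no schema."""
    data = msg.encode("utf-8")
    # An Avro string is a long byte-length followed by the UTF-8 bytes.
    return encode_long(ts) + encode_long(len(data)) + data


def decode_record(buf: bytes):
    """Reader side: needs the same (assumed) schema to interpret the bytes."""
    ts, pos = decode_long(buf)
    n, pos = decode_long(buf, pos)
    return ts, buf[pos:pos + n].decode("utf-8")


body = encode_record(1, "hi")  # these bytes could be the Flume event body
```

The resulting bytes carry no schema or delimiters, so the consumer has to 
obtain the schema some other way.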

> I think it is this way because the first package holds all the meta
> information (delimiters, schema).

> As far as I understand this jazz, there should be an avro source which
> understands my format (and is compatible with the original avro formats
> structure) to decode the messages into flume.
> But there is no avro source I can pass a schema to.

As I mentioned before, Flume can just treat the Avro messages as byte 
messages and put them somewhere (HDFS?) for further analysis.

-Mingjie