Posted to users@kafka.apache.org by "Maier, Dr. Andreas" <an...@asideas.de> on 2013/09/04 14:28:36 UTC

Sending huge binary files via Kafka?

Hello,

I have an architecture proposal on my desk where people want to store
huge binary files (images and videos up to several GB in size) in RiakCS.
But the connection to RiakCS is supposed to go through Apache Kafka,
so there will be a Kafka producer fetching the files from the source and
sending them to a Kafka-RiakCS consumer.
Now when I look at the Kafka configuration options
(http://kafka.apache.org/08/configuration.html)
I see that message.max.bytes is 1000000 (about 1 MB) by default, which
would be much too small for huge binary files like videos.
So my questions are:
Can this limit be increased to support messages of several GB as well?
Has anyone already tried this? Are Kafka brokers, consumers, and
producers able to handle messages of that size?
Will setting such a huge message size limit have any impact on the
performance of transporting smaller messages?
Or should our Kafka producers rather bypass Kafka when they encounter
such huge binary files at the source, and store these files directly
in RiakCS?
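
As far as I can tell from that page, raising the limit would involve at
least the following settings on the 0.8 line (values purely illustrative,
and multi-GB values seem far beyond what these settings were designed for):

    # broker (server.properties)
    message.max.bytes=10485760          # largest message the broker accepts (~10 MB here)
    replica.fetch.max.bytes=10485760    # apparently must be >= message.max.bytes,
                                        # or replication of large messages stalls

    # high-level consumer
    fetch.message.max.bytes=10485760    # must be >= the largest message to be fetched

The producer side may need a matching setting as well, depending on the
client version.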

Best Regards,

Andreas Maier


Re: Sending huge binary files via Kafka?

Posted by "Maier, Dr. Andreas" <an...@asideas.de>.
Magnus,

this sounds like an interesting idea. But given the current state of Kafka,
this would mean that I would have to extend the Kafka producer and consumer
classes myself to support that kind of message/file transfer, wouldn't it?
That is a bit too much effort for me at the moment.
But of course it would be nice if Kafka producers or consumers supported
zero-copy file transfer natively.

At the moment I am leaning more towards sending the consumer a message
containing the URL of the huge binary file, and letting the consumer fetch
the file from that URL directly. That way we would use Kafka only to send
a notification that a new file exists at the source; the actual file
transfer would bypass the Kafka queue.
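
Something along these lines, against the 0.8 Java producer API (the broker
address, topic name, and URL are invented for illustration):

    import java.util.Properties;

    import kafka.javaapi.producer.Producer;
    import kafka.producer.KeyedMessage;
    import kafka.producer.ProducerConfig;

    public class FileNotificationProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("metadata.broker.list", "broker1:9092");              // hypothetical broker
            props.put("serializer.class", "kafka.serializer.StringEncoder");

            Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));

            // The message carries only the location of the file, not its bytes.
            String fileUrl = "http://source.example.com/files/video-1234.mp4"; // hypothetical
            producer.send(new KeyedMessage<String, String>("file-notifications", fileUrl));

            producer.close();
        }
    }

The consumer would then read the URL, fetch the file over HTTP, and stream
it into RiakCS, so only a short notification message ever passes through
Kafka itself.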

Andreas Maier





Re: Sending huge binary files via Kafka?

Posted by Magnus Edenhill <ma...@edenhill.se>.
It would be possible to modify an existing client implementation to use
sendfile(2) to pass message contents from/to the filesystem rather than
(pre-)allocating send/receive buffers, thus providing zero-copy file
transfer over Kafka. I believe this is how the broker is implemented.
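
On the JVM this corresponds to FileChannel.transferTo(), which is what the
broker uses to serve fetch requests straight from its log segments. A
minimal sketch of the idea, with a made-up file path and endpoint, and
without any Kafka protocol framing:

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.nio.channels.FileChannel;
    import java.nio.channels.SocketChannel;

    public class ZeroCopySend {
        public static void main(String[] args) throws IOException {
            SocketChannel socket =
                SocketChannel.open(new InetSocketAddress("consumer-host", 9999)); // hypothetical
            FileChannel file =
                new FileInputStream("/data/video-1234.mp4").getChannel();         // hypothetical

            // transferTo() asks the kernel to copy file bytes straight to the
            // socket (sendfile(2) on Linux), never staging them in user space.
            long pos = 0, size = file.size();
            while (pos < size) {
                pos += file.transferTo(pos, size - pos, socket);
            }

            file.close();
            socket.close();
        }
    }

An actual client would additionally have to wrap the file bytes in the
Kafka wire-protocol framing, which is the modification work mentioned above.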

Regards,
Magnus



Re: Sending huge binary files via Kafka?

Posted by Neha Narkhede <ne...@gmail.com>.
The message size limit is imposed to protect the brokers and consumers from
running out of memory. The consumer has no support for streaming a message
and has to allocate enough memory to hold the largest message in full.
You could try compressing the files, but I'm not sure that will give you
enough space savings to make Kafka usage feasible.
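
Enabling compression is a one-line change on the 0.8 producer, for what
it's worth:

    Properties props = new Properties();
    props.put("compression.codec", "gzip");  // 0.8 producer config; "snappy" is also supported

But image and video formats are usually already compressed, so this is
unlikely to close a multi-GB gap.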

Thanks,
Neha