Posted to user@avro.apache.org by Aleksey Maslov <Al...@Lab49.com> on 2011/03/14 21:59:25 UTC

Avro vs. Hadoop serialization performance?

Hi,

Has any benchmarking been done to determine which serialization
architecture performs better, Hadoop's or Avro's?
I understand that language neutrality is Avro's big plus, but what about
performance?

And yes, it's a loaded question, since it all depends on the nature of the
data (text vs. numeric), but still, are they close?

Aleksey


--
View this message in context: http://apache-avro.679487.n3.nabble.com/Avro-vs-Hadoop-serialization-performance-tp2677357p2677357.html
Sent from the Avro - Users mailing list archive at Nabble.com.

Re: Avro vs. Hadoop serialization performance?

Posted by Scott Carey <sc...@richrelevance.com>.
If you're I/O bound, Avro will be faster.  Avro's raw field serialization
is very fast, but some types of object marshaling are not yet that fast.
Hadoop's Writables aren't all that fast themselves anyway.
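
One plausible contributor to the I/O-bound case (an assumption, not something measured in this thread) is encoded size. The Avro specification writes int and long values as zig-zag, variable-length integers, while Hadoop's IntWritable writes a fixed four bytes. The sketch below is a rough illustration of that difference only; the function names are made up for this example and it does not use either library. Hadoop also ships variable-length Writables, so this is not a like-for-like verdict.

```python
import struct


def avro_long_bytes(n: int) -> bytes:
    """Encode n as the Avro spec describes int/long: zig-zag, then base-128 varint."""
    # Zig-zag maps 0, -1, 1, -2, ... to 0, 1, 2, 3, ... so small magnitudes stay small.
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        # Emit the low 7 bits and set the continuation bit.
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def fixed_int_bytes(n: int) -> bytes:
    """Four big-endian bytes, the layout java.io.DataOutput.writeInt produces."""
    return struct.pack(">i", n)


# Print value, variable-length size, fixed size for a few sample values.
for value in (0, -1, 64, 1_000_000):
    print(value, len(avro_long_bytes(value)), len(fixed_int_bytes(value)))
```

Small values take one or two bytes in the variable-length form and always four in the fixed form, which matters most for numeric-heavy data.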

I don't know of any public benchmarks that directly compare the two in a
standard Hadoop MapReduce job.


When attempted with Pig, Avro was faster (PIG-794):
Storage       Time spent on job_1  Output size of job_1  Mapper task number of job_2  Time spent on job_2  Total time spent on Pig script
AvroStorage   3 min 51 sec         7.96G                 120                          17 min 09 sec        21 min 0 sec
InterStorage  4 min 33 sec         9.55G                 143                          17 min 17 sec        21 min 50 sec
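
To put the PIG-794 figures above in relative terms, here is a small arithmetic sketch. The numbers are copied from the table; the helper names and dictionary keys are invented for this example.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a 'M min S sec' reading into seconds."""
    return minutes * 60 + seconds


def percent_less(smaller: float, larger: float) -> float:
    """How much lower `smaller` is than `larger`, as a percentage of `larger`."""
    return (larger - smaller) / larger * 100


# Figures from the PIG-794 comparison quoted in this message.
avro = {
    "job_1_s": to_seconds(3, 51),
    "output_gb": 7.96,
    "job_2_mappers": 120,
    "job_2_s": to_seconds(17, 9),
    "total_s": to_seconds(21, 0),
}
inter = {
    "job_1_s": to_seconds(4, 33),
    "output_gb": 9.55,
    "job_2_mappers": 143,
    "job_2_s": to_seconds(17, 17),
    "total_s": to_seconds(21, 50),
}

for key in avro:
    print(f"{key}: {percent_less(avro[key], inter[key]):.1f}% lower with AvroStorage")
```

By this arithmetic AvroStorage spent roughly 15% less time on job_1 and wrote roughly 17% less output, but the whole script finished only about 4% sooner because job_2 took nearly the same time with either storage.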

