Posted to mapreduce-user@hadoop.apache.org by Manoj Babu <ma...@gmail.com> on 2012/07/15 18:37:42 UTC

Processing Large XML in Hadoop

Hi,

Could you kindly explain the pros and cons of using Hadoop's
StreamInputFormat and Mahout's XmlInputFormat?
How does the record reader read a record that spans multiple blocks when
dealing with large XML files?

Thanks in advance.

Cheers!
Manoj.