Posted to common-user@hadoop.apache.org by "Periya.Data" <pe...@gmail.com> on 2011/09/15 23:19:42 UTC
example of splitting a binary file
Hi all,
Is there a good example that shows how to break a large binary file into
splits? If there is one, please let me know. It would be a great place for
me to start.
Ideally, I want to build a custom InputFormat based on
SequenceFileAsBinaryInputFormat, with a custom record reader that can properly
read well-defined records (with known offsets) in my binary input file.
But for now I want to learn the basics: read a binary file,
break it into splits of known size, play with a record reader, and get
some output. I do not want to run any MapReduce on the data yet. Once I know how
to do that, I can gradually build on it.
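To make the goal concrete, here is a rough sketch in plain Java (no Hadoop
classes) of the kind of thing I mean. It assumes fixed-length records, which
may not match the real file, and the names BinarySplitDemo, Split,
computeSplits and readRecords are made up for illustration; they are not
Hadoop APIs.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class BinarySplitDemo {

    /** Hypothetical split type: the byte range [start, start + length). */
    static final class Split {
        final long start;
        final long length;

        Split(long start, long length) {
            this.start = start;
            this.length = length;
        }
    }

    /** Cuts a file of fileLength bytes into splits of at most splitSize bytes. */
    static List<Split> computeSplits(long fileLength, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<Split> splits = new ArrayList<>();
        for (long start = 0; start < fileLength; start += splitSize) {
            splits.add(new Split(start, Math.min(splitSize, fileLength - start)));
        }
        return splits;
    }

    /**
     * Reads every complete record that STARTS inside the split.
     * Assumption: records are fixed-length and the first one begins at offset 0.
     * A record that crosses the end of the split is still read by this split,
     * so each record is read exactly once across all splits.
     */
    static List<byte[]> readRecords(Path file, Split split, int recordLength)
            throws IOException {
        List<byte[]> records = new ArrayList<>();
        try (RandomAccessFile in = new RandomAccessFile(file.toFile(), "r")) {
            long fileLength = in.length();
            // First record boundary at or after the start of the split.
            long pos = ((split.start + recordLength - 1) / recordLength) * recordLength;
            long end = split.start + split.length;
            while (pos < end && pos + recordLength <= fileLength) {
                byte[] record = new byte[recordLength];
                in.seek(pos);
                in.readFully(record);
                records.add(record);
                pos += recordLength;
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Test data: 10 records of 8 bytes; every byte of record i has value i.
        int recordLength = 8;
        byte[] data = new byte[10 * recordLength];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i / recordLength);
        }
        Path file = Files.createTempFile("records", ".bin");
        try {
            Files.write(file, data);
            // A split size of 30 deliberately does not line up with record boundaries.
            for (Split s : computeSplits(data.length, 30)) {
                List<byte[]> records = readRecords(file, s, recordLength);
                System.out.printf("split start=%d length=%d records=%d%n",
                        s.start, s.length, records.size());
            }
        } finally {
            Files.delete(file);
        }
    }
}
```

The split size is chosen so that records straddle split boundaries, since that
is the case a real record reader has to handle.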
Please let me know if there are any links to such examples.
Thanks,
PD.