Posted to common-user@hadoop.apache.org by Dieter Plaetinck <di...@intec.ugent.be> on 2011/09/01 10:25:27 UTC

Re: Binary content

On Wed, 31 Aug 2011 08:44:42 -0700
Mohit Anchlia <mo...@gmail.com> wrote:

> Does MapReduce work well with binary content in files? The binary
> content is basically some CAD files, and the MapReduce program needs
> to read these files using a proprietary tool, extract values, and do
> some processing. I'm wondering whether others are doing a similar type
> of processing, and what the best practices are.

Yes, it works. You just need to select the right input format.
Personally, I store all my binary files in a SequenceFile (because my binary files are small).

Dieter
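
Editor's sketch of the approach Dieter describes: packing a directory of small binary files into a single SequenceFile, keyed by file name. This is one plausible way to do it, not necessarily how he does it. It assumes a Hadoop 2 or later client library on the classpath (the option-style SequenceFile.createWriter call); the class name SmallFilesToSequenceFile and the method pack are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// Hypothetical helper: packs small local binary files into one SequenceFile.
public class SmallFilesToSequenceFile {

    /**
     * Writes one record per regular file in localDir:
     * key = file name (Text), value = raw file bytes (BytesWritable).
     * Returns the number of records written.
     */
    public static int pack(Configuration conf, java.nio.file.Path localDir, Path output)
            throws IOException {
        int count = 0;
        try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                 SequenceFile.Writer.file(output),
                 SequenceFile.Writer.keyClass(Text.class),
                 SequenceFile.Writer.valueClass(BytesWritable.class));
             DirectoryStream<java.nio.file.Path> files = Files.newDirectoryStream(localDir)) {
            for (java.nio.file.Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip subdirectories and other non-file entries
                }
                // Assumption: each file is small enough to hold in memory.
                byte[] bytes = Files.readAllBytes(file);
                writer.append(new Text(file.getFileName().toString()),
                              new BytesWritable(bytes));
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: local input directory, args[1]: output SequenceFile path
        int n = pack(new Configuration(), Paths.get(args[0]), new Path(args[1]));
        System.out.println("Packed " + n + " files into " + args[1]);
    }
}
```

A job that reads the result with SequenceFileInputFormat would then receive each file as one (Text, BytesWritable) record, so the mapper can hand the bytes to whatever tool parses the format.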

Re: Binary content

Posted by Praveen Sripati <pr...@gmail.com>.
Mohit,

"Hadoop: The Definitive Guide" (Chapter 3 - Hadoop I/O) has a section on
SequenceFile and is worth reading.

http://oreilly.com/catalog/9780596521981

Thanks,
Praveen

On Thu, Sep 1, 2011 at 9:15 PM, Owen O'Malley <ow...@hortonworks.com> wrote:

> On Thu, Sep 1, 2011 at 8:37 AM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:
>
> > Thanks! Is there a specific tutorial I can focus on to see how it could be
> > done?
>
> Take the word count example and change its output format to be
> SequenceFileOutputFormat.
>
> job.setOutputFormatClass(SequenceFileOutputFormat.class);
>
> and it will generate SequenceFiles instead of text. There is
> SequenceFileInputFormat for reading.
>
> -- Owen
>

Re: Binary content

Posted by Owen O'Malley <ow...@hortonworks.com>.
On Thu, Sep 1, 2011 at 8:37 AM, Mohit Anchlia <mo...@gmail.com> wrote:

> Thanks! Is there a specific tutorial I can focus on to see how it could be
> done?

Take the word count example and change its output format to be
SequenceFileOutputFormat.

job.setOutputFormatClass(SequenceFileOutputFormat.class);

and it will generate SequenceFiles instead of text. There is
SequenceFileInputFormat for reading.

-- Owen
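
Editor's sketch of Owen's suggestion in context: a word count job whose only change from the usual example is the output format line. It assumes the newer org.apache.hadoop.mapreduce API and Hadoop 2 or later (Job.getInstance); the class and method names (WordCountToSequenceFile, TokenMapper, SumReducer, buildJob) are hypothetical and only loosely follow the stock example.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class WordCountToSequenceFile {

    // Emits (word, 1) for every whitespace-separated token in a line.
    public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            StringTokenizer tokens = new StringTokenizer(line.toString());
            while (tokens.hasMoreTokens()) {
                word.set(tokens.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Sums the counts for each word; also usable as a combiner.
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text word, Iterable<IntWritable> counts, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable c : counts) {
                sum += c.get();
            }
            context.write(word, new IntWritable(sum));
        }
    }

    public static Job buildJob(Configuration conf, Path in, Path out) throws IOException {
        Job job = Job.getInstance(conf, "word count to SequenceFile");
        job.setJarByClass(WordCountToSequenceFile.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // The one-line change: write (Text, IntWritable) records as a
        // SequenceFile instead of tab-separated text.
        job.setOutputFormatClass(SequenceFileOutputFormat.class);
        FileInputFormat.addInputPath(job, in);
        FileOutputFormat.setOutputPath(job, out);
        return job;
    }

    public static void main(String[] args) throws Exception {
        // args[0]: input path, args[1]: output directory (must not exist yet)
        Job job = buildJob(new Configuration(), new Path(args[0]), new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

A follow-up job can consume this output by calling job.setInputFormatClass(SequenceFileInputFormat.class), in which case its mapper's input key and value types would be Text and IntWritable.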

Re: Binary content

Posted by Mohit Anchlia <mo...@gmail.com>.
On Thu, Sep 1, 2011 at 1:25 AM, Dieter Plaetinck
<di...@intec.ugent.be> wrote:
> On Wed, 31 Aug 2011 08:44:42 -0700
> Mohit Anchlia <mo...@gmail.com> wrote:
>
>> Does MapReduce work well with binary content in files? The binary
>> content is basically some CAD files, and the MapReduce program needs
>> to read these files using a proprietary tool, extract values, and do
>> some processing. I'm wondering whether others are doing a similar type
>> of processing, and what the best practices are.
>
> Yes, it works. You just need to select the right input format.
> Personally, I store all my binary files in a SequenceFile (because my binary files are small).

Thanks! Is there a specific tutorial I can focus on to see how it could be done?
>
> Dieter
>