Posted to common-user@hadoop.apache.org by Mohit Anchlia <mo...@gmail.com> on 2011/08/31 17:44:42 UTC

Binary content

Does MapReduce work well with binary content in files? The binary
content is basically CAD files, and the MapReduce program needs to read
these files using a proprietary tool, extract values, and do some
processing. I'm wondering whether others are doing a similar type of
processing, and what the best practices are.

Re: Binary content

Posted by Praveen Sripati <pr...@gmail.com>.
Mohit,

"Hadoop: The Definitive Guide" has a section on SequenceFile in its
Hadoop I/O chapter, and it is worth reading.

http://oreilly.com/catalog/9780596521981

Thanks,
Praveen

On Thu, Sep 1, 2011 at 9:15 PM, Owen O'Malley <ow...@hortonworks.com> wrote:

> On Thu, Sep 1, 2011 at 8:37 AM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:
>
> > Thanks! Is there a specific tutorial I can focus on to see how it could be
> > done?
>
> Take the word count example and change its output format to be
> SequenceFileOutputFormat.
>
> job.setOutputFormatClass(SequenceFileOutputFormat.class);
>
> and it will generate SequenceFiles instead of text. There is
> SequenceFileInputFormat for reading.
>
> -- Owen
>

Re: Binary content

Posted by Owen O'Malley <ow...@hortonworks.com>.
On Thu, Sep 1, 2011 at 8:37 AM, Mohit Anchlia <mo...@gmail.com> wrote:

> Thanks! Is there a specific tutorial I can focus on to see how it could be
> done?

Take the word count example and change its output format to be
SequenceFileOutputFormat.

job.setOutputFormatClass(SequenceFileOutputFormat.class);

and it will generate SequenceFiles instead of text. There is
SequenceFileInputFormat for reading.

-- Owen

Re: Binary content

Posted by Mohit Anchlia <mo...@gmail.com>.
On Thu, Sep 1, 2011 at 1:25 AM, Dieter Plaetinck
<di...@intec.ugent.be> wrote:
> On Wed, 31 Aug 2011 08:44:42 -0700
> Mohit Anchlia <mo...@gmail.com> wrote:
>
>> Does MapReduce work well with binary content in files? The binary
>> content is basically CAD files, and the MapReduce program needs to read
>> these files using a proprietary tool, extract values, and do some
>> processing. I'm wondering whether others are doing a similar type of
>> processing, and what the best practices are.
>
> Yes, it works. You just need to select the right input format.
> Personally, I store all my binary files in a SequenceFile (because my binary files are small).

Thanks! Is there a specific tutorial I can focus on to see how it could be done?

> Dieter

Re: Binary content

Posted by Dieter Plaetinck <di...@intec.ugent.be>.
On Wed, 31 Aug 2011 08:44:42 -0700
Mohit Anchlia <mo...@gmail.com> wrote:

> Does MapReduce work well with binary content in files? The binary
> content is basically CAD files, and the MapReduce program needs to read
> these files using a proprietary tool, extract values, and do some
> processing. I'm wondering whether others are doing a similar type of
> processing, and what the best practices are.

Yes, it works. You just need to select the right input format.
Personally, I store all my binary files in a SequenceFile (because my binary files are small).

Dieter
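
Editor's sketch: the idea of packing many small binary files into one container of (name, bytes) records can be illustrated with plain Java. This is not Hadoop's SequenceFile format or API. The class name SmallFilePacker, the record layout (UTF name, int length, raw bytes), and the sample file names are all assumptions made up for illustration, and the code uses only the Java standard library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: packs small binary blobs into one stream of (name, bytes) records. */
public class SmallFilePacker {

    /** Writes each entry as: UTF-encoded name, int length, then the raw bytes. */
    public static byte[] pack(Map<String, byte[]> files) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            for (Map.Entry<String, byte[]> e : files.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeInt(e.getValue().length);
                out.write(e.getValue());
            }
        }
        return buffer.toByteArray();
    }

    /** Reads records back in the order they were written. */
    public static Map<String, byte[]> unpack(byte[] packed) throws IOException {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(packed))) {
            // available() on a ByteArrayInputStream reports the bytes left to read.
            while (in.available() > 0) {
                String name = in.readUTF();
                byte[] content = new byte[in.readInt()];
                in.readFully(content);
                files.put(name, content);
            }
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        Map<String, byte[]> files = new LinkedHashMap<>();
        // Made-up names and contents standing in for small CAD files.
        files.put("part-a.dwg", new byte[] {0x00, (byte) 0xFF, 0x10});
        files.put("part-b.dwg", new byte[] {0x7F, 0x00});

        byte[] packed = pack(files);
        for (Map.Entry<String, byte[]> e : unpack(packed).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().length + " bytes");
        }
    }
}
```

In an actual Hadoop job the analogous container would be a SequenceFile, for example with a Text key holding the file name and a BytesWritable value holding the contents, read through the SequenceFileInputFormat mentioned earlier in the thread. Each map call would then receive one whole binary file as a byte array that can be handed to the proprietary tool.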