Posted to dev@nutch.apache.org by HUYLEBROECK Jeremy RD-ILAB-SSF <je...@orange-ft.com> on 2006/08/29 03:41:00 UTC

Hadoop job question

I currently have an MR task that reads a SequenceFile via the map method
to output some data.

My goal is to output some data to MySQL but I'd like to read several
records before doing the INSERT.

But I can't figure out how to get several records...
They all have different keys so the reduce task only gets one at a time.


Thanks for any help!


Re: Hadoop job question

Posted by Dennis Kubes <nu...@dragonflymc.com>.
Although it is kinda hacking the system, you may be able to do it in the
map method by writing a custom MapRunner and having an object that lives
in the MapRunner but that you set into each mapper instance.

Dennis
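
[Editor's sketch, not part of the original reply.] The suggestion above
amounts to keeping one long-lived buffer that sees every record and only
writes when enough have piled up. Below is a minimal sketch of that
buffering idea in plain, modern Java, with no Hadoop or JDBC dependency
so it runs on its own. The class BatchBuffer, its add/close methods, and
the batch size of 3 are all hypothetical names chosen for illustration.
In a real job, add() would presumably be called from map(), close()
after the runner has read its last record, and the sink would issue one
multi-row INSERT instead of printing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: collects records and hands them off in groups. */
class BatchBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink; // stand-in for "do one INSERT"
    private final List<T> pending = new ArrayList<>();

    BatchBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Call once per record; flushes when the batch is full. */
    void add(T record) {
        pending.add(record);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Call once after the last record so a partial batch is not lost. */
    void close() {
        flush();
    }

    private void flush() {
        if (pending.isEmpty()) {
            return;
        }
        // Pass a copy so the sink can keep the list after we clear ours.
        sink.accept(new ArrayList<>(pending));
        pending.clear();
    }

    public static void main(String[] args) {
        BatchBuffer<String> buffer =
            new BatchBuffer<>(3, batch -> System.out.println("INSERT " + batch));
        for (String record : new String[] {"a", "b", "c", "d", "e"}) {
            buffer.add(record);
        }
        buffer.close();
        // prints:
        // INSERT [a, b, c]
        // INSERT [d, e]
    }
}
```

The final close() call matters: without it, the last partial batch
("d" and "e" here) would never be written. Since the records never need
to share a key, this approach does not depend on the reduce step at all.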

HUYLEBROECK Jeremy RD-ILAB-SSF wrote:
> I currently have a MR task that reads a SequenceFile via the map method
> to output some data.
>
> My goal is to output some data to MySQL but I'd like to read several
> records before doing the INSERT.
>
> But I can't figure out how to get several records...
> They all have different keys so the reduce task only gets one at a time.
>
>
> Thanks for any help!
>
>