Posted to user@hbase.apache.org by Rakhi Khatwani <ra...@gmail.com> on 2009/04/21 12:19:29 UTC
Bulk read in a single map task.
Hi,
I have a scenario: I have a table that has to be read into, say, 'n' maps.
Within each map I need to access, say, 'm' records at once, so
that I can process them in parallel using threads.
Is this feasible? I am using Hadoop 0.19.0 and HBase 0.19.0.
Thanks
Raakhi
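The pattern described above -- a map task that has 'm' records in hand and fans them out to worker threads -- can be sketched in plain Java. This is a stdlib-only illustration with no Hadoop/HBase dependencies; the String record type, the batch size of 5, and the processRecord placeholder are assumptions for the example (in HBase 0.19 the records would be RowResult objects):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: process a batch of 'm' records in parallel inside one map task.
public class ParallelBatchDemo {

    static final AtomicInteger processed = new AtomicInteger();

    static void processRecord(String record) {
        // Placeholder for the real per-record work (e.g. writing back to HBase).
        processed.incrementAndGet();
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for the 'm' records a map would have in hand
        // (e.g. the rows returned by Scanner.next(5)).
        List<String> batch = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            batch.add("row-" + i);
        }

        // Fan the records out to a small thread pool.
        ExecutorService pool = Executors.newFixedThreadPool(5);
        for (String record : batch) {
            pool.submit(() -> processRecord(record));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);

        System.out.println("processed " + processed.get() + " records");
    }
}
```

One caveat with this approach: the map's output collector in Hadoop 0.19 is not thread-safe, so worker threads should either do side-effecting work only (as here) or synchronize on the collector when emitting.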
Re: Bulk read in a single map task.
Posted by Rakhi Khatwani <ra...@gmail.com>.
Thanks, Stack.
Will try that tomorrow.
Regards,
Raakhi
On Wed, Apr 22, 2009 at 10:33 PM, stack <st...@duboce.net> wrote:
> On Wed, Apr 22, 2009 at 9:53 AM, Rakhi Khatwani <rakhi.khatwani@gmail.com> wrote:
>
> > Hi Stack,
> > In the traditional scenario, an InputSplit is given to the map,
> > and the map iterates through its records sequentially, right?
> > Is there any way I can have 5 (for example) records in each map
> > iteration?
>
>
>
> If you can't change your database schema so each row has all you need per
> map, or if Scanner.next(int count) won't work for you -- i.e. fetch 'count'
> items on each next invocation (perhaps this will work, I don't know, I just
> saw it in the Interface) -- then you might want to play with
> org.apache.hadoop.mapred.MapRunner. It's the class that invokes the map.
> You can subclass it, grab a bunch of rows, and feed them all in a lump to
> an amended map.
>
> St.Ack
>
Re: Bulk read in a single map task.
Posted by stack <st...@duboce.net>.
On Wed, Apr 22, 2009 at 9:53 AM, Rakhi Khatwani <ra...@gmail.com> wrote:
> Hi Stack,
> In the traditional scenario, an InputSplit is given to the map,
> and the map iterates through its records sequentially, right?
> Is there any way I can have 5 (for example) records in each map
> iteration?
If you can't change your database schema so each row has all you need per
map, or if Scanner.next(int count) won't work for you -- i.e. fetch 'count'
items on each next invocation (perhaps this will work, I don't know, I just
saw it in the Interface) -- then you might want to play with
org.apache.hadoop.mapred.MapRunner. It's the class that invokes the map.
You can subclass it, grab a bunch of rows, and feed them all in a lump to
an amended map.
St.Ack
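The MapRunner idea above -- pull several rows off the input and hand them to the map in one lump -- can be illustrated with a stdlib-only sketch. A real implementation would subclass org.apache.hadoop.mapred.MapRunner and override its run(RecordReader, OutputCollector, Reporter) method; the BatchRunner, BatchMapper, and mapBatch names below are illustrative stand-ins, not Hadoop API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Sketch of a batching runner: instead of calling map once per record,
// accumulate batchSize records and call the "map" once per batch.
public class BatchRunner {

    // Stand-in for an amended map that accepts a lump of records.
    interface BatchMapper {
        void mapBatch(List<String> records);
    }

    static void run(Iterator<String> input, int batchSize, BatchMapper mapper) {
        List<String> batch = new ArrayList<>();
        while (input.hasNext()) {
            batch.add(input.next());
            if (batch.size() == batchSize) {
                mapper.mapBatch(batch);
                batch = new ArrayList<>();
            }
        }
        if (!batch.isEmpty()) {
            mapper.mapBatch(batch);   // flush the final partial batch
        }
    }

    public static void main(String[] args) {
        // Stand-in for the records a RecordReader would supply.
        List<String> rows = Arrays.asList("r1", "r2", "r3", "r4", "r5", "r6", "r7");
        run(rows.iterator(), 3, batch ->
                System.out.println("map called with " + batch.size() + " records"));
    }
}
```

Inside each mapBatch call the map could then spawn its worker threads over the batch, which is the combination the original question was after.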
Re: Bulk read in a single map task.
Posted by Rakhi Khatwani <ra...@gmail.com>.
Hi Stack,
In the traditional scenario, an InputSplit is given to the map,
and the map iterates through its records sequentially, right?
Is there any way I can have 5 (for example) records in each map
iteration?
Sorry for not being very clear last time.
Thanks,
Rakhi
On Wed, Apr 22, 2009 at 10:13 PM, stack <st...@duboce.net> wrote:
> Sorry. I'm having trouble following your question below. Want to have
> another go at it?
> Thanks,
> St.Ack
>
> On Tue, Apr 21, 2009 at 3:19 AM, Rakhi Khatwani <rakhi.khatwani@gmail.com> wrote:
>
> > Hi,
> > I have a scenario: I have a table that has to be read into, say, 'n' maps.
> > Within each map I need to access, say, 'm' records at once, so
> > that I can process them in parallel using threads.
> > Is this feasible? I am using Hadoop 0.19.0 and HBase 0.19.0.
> >
> > Thanks
> > Raakhi
> >
>
Re: Bulk read in a single map task.
Posted by stack <st...@duboce.net>.
Sorry. I'm having trouble following your question below. Want to have
another go at it?
Thanks,
St.Ack
On Tue, Apr 21, 2009 at 3:19 AM, Rakhi Khatwani <ra...@gmail.com> wrote:
> Hi,
> I have a scenario: I have a table that has to be read into, say, 'n' maps.
> Within each map I need to access, say, 'm' records at once, so
> that I can process them in parallel using threads.
> Is this feasible? I am using Hadoop 0.19.0 and HBase 0.19.0.
>
> Thanks
> Raakhi
>