Posted to user@pig.apache.org by Niels Basjes <Ni...@basjes.nl> on 2015/02/26 09:26:58 UTC
Tuning the number of mappers when using HBaseStorage
Hi,
I wrote a Pig script that uses HBaseStorage to read data from HBase (I
do a full table scan for this use case). The table currently has only 7
regions, and I found that the Pig job runs with 7 mappers to read
this data.
I increased the number of splits in this table, and the number of mappers
increased to the same value.
Increasing the number of splits just to get more mappers seems a bit
strange, though.
Question: Is it possible to use HBaseStorage as input with more mappers
than there are regions? Or can I explicitly specify the key range for
each mapper (i.e. ignoring the actual regions in the table)?
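
To make the "key ranges per mapper" idea concrete, here is a minimal Python
sketch of how a row-key range could be cut into more pieces than there are
regions. It is not part of Pig or HBaseStorage: the function name
split_key_range is hypothetical, and it assumes row keys can be treated as
big-endian byte strings that are roughly evenly distributed. Whether
HBaseStorage can be made to consume such ranges is exactly what I am asking.

```python
def split_key_range(start: bytes, stop: bytes, parts: int) -> list[tuple[bytes, bytes]]:
    """Divide the half-open row-key range [start, stop) into contiguous sub-ranges.

    Hypothetical helper for illustration only. Keys are compared as
    big-endian unsigned integers after zero-padding to a common width.
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    width = max(len(start), len(stop))
    lo = int.from_bytes(start.ljust(width, b"\x00"), "big")
    hi = int.from_bytes(stop.ljust(width, b"\x00"), "big")
    if lo >= hi:
        raise ValueError("start must sort before stop")

    # Evenly spaced boundaries, including both end points.
    bounds = [lo + (hi - lo) * i // parts for i in range(parts + 1)]
    keys = [b.to_bytes(width, "big") for b in bounds]

    # Keep the caller's exact end points rather than the padded versions.
    keys[0], keys[-1] = start, stop

    # Drop empty sub-ranges, which appear when the range is narrower than `parts`.
    return [(a, b) for a, b in zip(keys[:-1], keys[1:]) if a < b]


ranges = split_key_range(b"a", b"e", 4)
# ranges == [(b"a", b"b"), (b"b", b"c"), (b"c", b"d"), (b"d", b"e")]
```

Each (start, stop) pair would then be the scan range for one mapper,
regardless of where the region boundaries fall.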
--
Best regards / Met vriendelijke groeten,
Niels Basjes