You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hbase.apache.org by Narendra yadala <na...@gmail.com> on 2012/04/19 12:43:07 UTC

HBase parallel scanner performance

I have an issue with my HBase cluster. We have a 4 node HBase/Hadoop (4*32
GB RAM and 4*6 TB disk space) cluster. We are using Cloudera distribution
for maintaining our cluster. I have a single tweets table in which we store
the tweets, one tweet per row (it has millions of rows currently).

Now I try to run a Java batch (not a map reduce) which does the following :

   1. Open a scanner over the tweet table and read the tweets one after
   another. I set scanner caching to 128 rows as higher scanner caching is
   leading to ScannerTimeoutExceptions. I scan over the first 10k rows only.
   2. For each tweet, extract URLs (linkcolfamily:urlvalue) that are there
   in that tweet and open another scanner over the tweets table to see who
   else shared that link. This involves getting rows having that URL from the
   entire table (not first 10k rows).
   3. Do similar stuff as in step 2 for hashtags
   (hashtagcolfamily:hashtagvalue).
   4. Do steps 1-3 in parallel for approximately 7-8 threads. This number
   can be higher (thousands also) later.


When I run this batch I got the GC issue which is specified here
http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/
Then I tried to turn on the MSLAB feature and changed the GC settings by
specifying  -XX:+UseParNewGC  and  -XX:+UseConcMarkSweepGC JVM flags.
Even after doing this, I am running into all kinds of IOExceptions
and SocketTimeoutExceptions.

This Java batch opens approximately 7*2 (14) scanners open at a point in
time and still I am running into all kinds of troubles. I am wondering
whether I can have thousands of parallel scanners with HBase when I need to
scale.

It would be great to know whether I can open thousands/millions of scanners
in parallel with HBase efficiently.

Thanks
Narendra

RE: HBase parallel scanner performance

Posted by Bijieshan <bi...@huawei.com>.
Narendra,

Since I didn't see the client logs , FullGC is one probably reason I suspect. No matter it happens in client side or server side. So I suggest to check the GC log (Open the client GC log both at server and client side) to see whether FullGC happens with a high frequency, and check the stop-the-world pause time.

I don't think parallel scanners is the problem. 

Jieshan

-----Original Message-----
From: Narendra yadala [mailto:narendra.yadala@gmail.com] 
Sent: Thursday, April 19, 2012 11:24 PM
To: user@hbase.apache.org
Subject: Re: HBase parallel scanner performance

Hi Jieshan

HBase version : Version 0.90.4-cdh3u3
Size of Key Value pair should not be more than 2KB
I changed the GC parameters at the server side. I have not looked into GC
logs yet but I have noticed that it pausing the batch process every now and
then. How do I look at the server GC logs?

Thanks
Narendra

On Thu, Apr 19, 2012 at 7:46 PM, Bijieshan <bi...@huawei.com> wrote:

> Hi Narendra,
>
> I have a few doubts:
>
> 1. Which version you are using?
> 2. What's the size of each KeyValue?
> 3. Did you change the GC parameters in client side or server side? After
> changing the GC parameters, did you keep an eye on the GC logs?
>
> Thank you.
>
> Regards,
> Jieshan
>
> -----Original Message-----
> From: Narendra yadala [mailto:narendra.yadala@gmail.com]
> Sent: Thursday, April 19, 2012 8:04 PM
> To: user@hbase.apache.org
> Subject: Re: HBase parallel scanner performance
>
> Hi Michel
>
> Yes, that is exactly what I do in step 2. I am aware of the reason for the
> scanner timeout exceptions. It is the time between two consecutive
> invocations of the next call on a specific scanner object. I increased the
> scanner timeout to 10 min on the region server and still I keep seeing the
> timeouts. So I reduced my scanner cache to 128.
>
> Full table scan takes 130 seconds and there are 2.2 million rows in the
> table as of now. Each row is around 2 KB in size. I measured time for the
> full table scan by issuing `count` command from the hbase shell.
>
> I kind of understood the fix that you are specifying, but do I need to
> change the table structure to fix this problem? All I do is a n^2 operation
> and even that fails with 10 different types of exceptions. It is mildly
> annoying that I need to know all the low level storage details of HBase to
> do such a simple operation. And this is happening for just 14 parallel
> scanners. I am wondering what would happen when there are thousands of
> parallel scanners.
>
> Please let me know if there is any configuration param change which would
> fix this issue.
>
> Thanks a lot
> Narendra
>
> On Thu, Apr 19, 2012 at 4:40 PM, Michel Segel <michael_segel@hotmail.com
> >wrote:
>
> > So in your step 2 you have the following:
> > FOREACH row IN TABLE alpha:
> >     SELECT something
> >     FROM TABLE alpha
> >     WHERE alpha.url = row.url
> >
> > Right?
> > And you are wondering why you are getting timeouts?
> > ...
> > ...
> > And how long does it take to do a full table scan? ;-)
> > (there's more, but that's the first thing you should see...)
> >
> > Try creating a second table where you invert the URL and key pair such
> > that for each URL, you have a set of your alpha table's keys?
> >
> > Then you have the following...
> > FOREACH row IN TABLE alpha:
> >   FETCH key-set FROM beta
> >   WHERE beta.rowkey = alpha.url
> >
> > Note I use FETCH to signify that you should get a single row in response.
> >
> > Does this make sense?
> > ( your second table is actually and index of the URL column in your first
> > table)
> >
> > HTH
> >
> > Sent from a remote device. Please excuse any typos...
> >
> > Mike Segel
> >
> > On Apr 19, 2012, at 5:43 AM, Narendra yadala <na...@gmail.com>
> > wrote:
> >
> > > I have an issue with my HBase cluster. We have a 4 node HBase/Hadoop
> > (4*32
> > > GB RAM and 4*6 TB disk space) cluster. We are using Cloudera
> distribution
> > > for maintaining our cluster. I have a single tweets table in which we
> > store
> > > the tweets, one tweet per row (it has millions of rows currently).
> > >
> > > Now I try to run a Java batch (not a map reduce) which does the
> > following :
> > >
> > >   1. Open a scanner over the tweet table and read the tweets one after
> > >   another. I set scanner caching to 128 rows as higher scanner caching
> is
> > >   leading to ScannerTimeoutExceptions. I scan over the first 10k rows
> > only.
> > >   2. For each tweet, extract URLs (linkcolfamily:urlvalue) that are
> there
> > >   in that tweet and open another scanner over the tweets table to see
> who
> > >   else shared that link. This involves getting rows having that URL
> from
> > the
> > >   entire table (not first 10k rows).
> > >   3. Do similar stuff as in step 2 for hashtags
> > >   (hashtagcolfamily:hashtagvalue).
> > >   4. Do steps 1-3 in parallel for approximately 7-8 threads. This
> number
> > >   can be higher (thousands also) later.
> > >
> > >
> > > When I run this batch I got the GC issue which is specified here
> > >
> >
> http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/
> > > Then I tried to turn on the MSLAB feature and changed the GC settings
> by
> > > specifying  -XX:+UseParNewGC  and  -XX:+UseConcMarkSweepGC JVM flags.
> > > Even after doing this, I am running into all kinds of IOExceptions
> > > and SocketTimeoutExceptions.
> > >
> > > This Java batch opens approximately 7*2 (14) scanners open at a point
> in
> > > time and still I am running into all kinds of troubles. I am wondering
> > > whether I can have thousands of parallel scanners with HBase when I
> need
> > to
> > > scale.
> > >
> > > It would be great to know whether I can open thousands/millions of
> > scanners
> > > in parallel with HBase efficiently.
> > >
> > > Thanks
> > > Narendra
> >
>

Re: HBase parallel scanner performance

Posted by Narendra yadala <na...@gmail.com>.
Hi Jieshan

HBase version : Version 0.90.4-cdh3u3
Size of Key Value pair should not be more than 2KB
I changed the GC parameters at the server side. I have not looked into GC
logs yet but I have noticed that it pausing the batch process every now and
then. How do I look at the server GC logs?

Thanks
Narendra

On Thu, Apr 19, 2012 at 7:46 PM, Bijieshan <bi...@huawei.com> wrote:

> Hi Narendra,
>
> I have a few doubts:
>
> 1. Which version you are using?
> 2. What's the size of each KeyValue?
> 3. Did you change the GC parameters in client side or server side? After
> changing the GC parameters, did you keep an eye on the GC logs?
>
> Thank you.
>
> Regards,
> Jieshan
>
> -----Original Message-----
> From: Narendra yadala [mailto:narendra.yadala@gmail.com]
> Sent: Thursday, April 19, 2012 8:04 PM
> To: user@hbase.apache.org
> Subject: Re: HBase parallel scanner performance
>
> Hi Michel
>
> Yes, that is exactly what I do in step 2. I am aware of the reason for the
> scanner timeout exceptions. It is the time between two consecutive
> invocations of the next call on a specific scanner object. I increased the
> scanner timeout to 10 min on the region server and still I keep seeing the
> timeouts. So I reduced my scanner cache to 128.
>
> Full table scan takes 130 seconds and there are 2.2 million rows in the
> table as of now. Each row is around 2 KB in size. I measured time for the
> full table scan by issuing `count` command from the hbase shell.
>
> I kind of understood the fix that you are specifying, but do I need to
> change the table structure to fix this problem? All I do is a n^2 operation
> and even that fails with 10 different types of exceptions. It is mildly
> annoying that I need to know all the low level storage details of HBase to
> do such a simple operation. And this is happening for just 14 parallel
> scanners. I am wondering what would happen when there are thousands of
> parallel scanners.
>
> Please let me know if there is any configuration param change which would
> fix this issue.
>
> Thanks a lot
> Narendra
>
> On Thu, Apr 19, 2012 at 4:40 PM, Michel Segel <michael_segel@hotmail.com
> >wrote:
>
> > So in your step 2 you have the following:
> > FOREACH row IN TABLE alpha:
> >     SELECT something
> >     FROM TABLE alpha
> >     WHERE alpha.url = row.url
> >
> > Right?
> > And you are wondering why you are getting timeouts?
> > ...
> > ...
> > And how long does it take to do a full table scan? ;-)
> > (there's more, but that's the first thing you should see...)
> >
> > Try creating a second table where you invert the URL and key pair such
> > that for each URL, you have a set of your alpha table's keys?
> >
> > Then you have the following...
> > FOREACH row IN TABLE alpha:
> >   FETCH key-set FROM beta
> >   WHERE beta.rowkey = alpha.url
> >
> > Note I use FETCH to signify that you should get a single row in response.
> >
> > Does this make sense?
> > ( your second table is actually and index of the URL column in your first
> > table)
> >
> > HTH
> >
> > Sent from a remote device. Please excuse any typos...
> >
> > Mike Segel
> >
> > On Apr 19, 2012, at 5:43 AM, Narendra yadala <na...@gmail.com>
> > wrote:
> >
> > > I have an issue with my HBase cluster. We have a 4 node HBase/Hadoop
> > (4*32
> > > GB RAM and 4*6 TB disk space) cluster. We are using Cloudera
> distribution
> > > for maintaining our cluster. I have a single tweets table in which we
> > store
> > > the tweets, one tweet per row (it has millions of rows currently).
> > >
> > > Now I try to run a Java batch (not a map reduce) which does the
> > following :
> > >
> > >   1. Open a scanner over the tweet table and read the tweets one after
> > >   another. I set scanner caching to 128 rows as higher scanner caching
> is
> > >   leading to ScannerTimeoutExceptions. I scan over the first 10k rows
> > only.
> > >   2. For each tweet, extract URLs (linkcolfamily:urlvalue) that are
> there
> > >   in that tweet and open another scanner over the tweets table to see
> who
> > >   else shared that link. This involves getting rows having that URL
> from
> > the
> > >   entire table (not first 10k rows).
> > >   3. Do similar stuff as in step 2 for hashtags
> > >   (hashtagcolfamily:hashtagvalue).
> > >   4. Do steps 1-3 in parallel for approximately 7-8 threads. This
> number
> > >   can be higher (thousands also) later.
> > >
> > >
> > > When I run this batch I got the GC issue which is specified here
> > >
> >
> http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/
> > > Then I tried to turn on the MSLAB feature and changed the GC settings
> by
> > > specifying  -XX:+UseParNewGC  and  -XX:+UseConcMarkSweepGC JVM flags.
> > > Even after doing this, I am running into all kinds of IOExceptions
> > > and SocketTimeoutExceptions.
> > >
> > > This Java batch opens approximately 7*2 (14) scanners open at a point
> in
> > > time and still I am running into all kinds of troubles. I am wondering
> > > whether I can have thousands of parallel scanners with HBase when I
> need
> > to
> > > scale.
> > >
> > > It would be great to know whether I can open thousands/millions of
> > scanners
> > > in parallel with HBase efficiently.
> > >
> > > Thanks
> > > Narendra
> >
>

RE: HBase parallel scanner performance

Posted by Bijieshan <bi...@huawei.com>.
Hi Narendra,

I have a few doubts:

1. Which version you are using?
2. What's the size of each KeyValue?
3. Did you change the GC parameters in client side or server side? After changing the GC parameters, did you keep an eye on the GC logs? 

Thank you.

Regards,
Jieshan

-----Original Message-----
From: Narendra yadala [mailto:narendra.yadala@gmail.com] 
Sent: Thursday, April 19, 2012 8:04 PM
To: user@hbase.apache.org
Subject: Re: HBase parallel scanner performance

Hi Michel

Yes, that is exactly what I do in step 2. I am aware of the reason for the
scanner timeout exceptions. It is the time between two consecutive
invocations of the next call on a specific scanner object. I increased the
scanner timeout to 10 min on the region server and still I keep seeing the
timeouts. So I reduced my scanner cache to 128.

Full table scan takes 130 seconds and there are 2.2 million rows in the
table as of now. Each row is around 2 KB in size. I measured time for the
full table scan by issuing `count` command from the hbase shell.

I kind of understood the fix that you are specifying, but do I need to
change the table structure to fix this problem? All I do is a n^2 operation
and even that fails with 10 different types of exceptions. It is mildly
annoying that I need to know all the low level storage details of HBase to
do such a simple operation. And this is happening for just 14 parallel
scanners. I am wondering what would happen when there are thousands of
parallel scanners.

Please let me know if there is any configuration param change which would
fix this issue.

Thanks a lot
Narendra

On Thu, Apr 19, 2012 at 4:40 PM, Michel Segel <mi...@hotmail.com>wrote:

> So in your step 2 you have the following:
> FOREACH row IN TABLE alpha:
>     SELECT something
>     FROM TABLE alpha
>     WHERE alpha.url = row.url
>
> Right?
> And you are wondering why you are getting timeouts?
> ...
> ...
> And how long does it take to do a full table scan? ;-)
> (there's more, but that's the first thing you should see...)
>
> Try creating a second table where you invert the URL and key pair such
> that for each URL, you have a set of your alpha table's keys?
>
> Then you have the following...
> FOREACH row IN TABLE alpha:
>   FETCH key-set FROM beta
>   WHERE beta.rowkey = alpha.url
>
> Note I use FETCH to signify that you should get a single row in response.
>
> Does this make sense?
> ( your second table is actually and index of the URL column in your first
> table)
>
> HTH
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Apr 19, 2012, at 5:43 AM, Narendra yadala <na...@gmail.com>
> wrote:
>
> > I have an issue with my HBase cluster. We have a 4 node HBase/Hadoop
> (4*32
> > GB RAM and 4*6 TB disk space) cluster. We are using Cloudera distribution
> > for maintaining our cluster. I have a single tweets table in which we
> store
> > the tweets, one tweet per row (it has millions of rows currently).
> >
> > Now I try to run a Java batch (not a map reduce) which does the
> following :
> >
> >   1. Open a scanner over the tweet table and read the tweets one after
> >   another. I set scanner caching to 128 rows as higher scanner caching is
> >   leading to ScannerTimeoutExceptions. I scan over the first 10k rows
> only.
> >   2. For each tweet, extract URLs (linkcolfamily:urlvalue) that are there
> >   in that tweet and open another scanner over the tweets table to see who
> >   else shared that link. This involves getting rows having that URL from
> the
> >   entire table (not first 10k rows).
> >   3. Do similar stuff as in step 2 for hashtags
> >   (hashtagcolfamily:hashtagvalue).
> >   4. Do steps 1-3 in parallel for approximately 7-8 threads. This number
> >   can be higher (thousands also) later.
> >
> >
> > When I run this batch I got the GC issue which is specified here
> >
> http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/
> > Then I tried to turn on the MSLAB feature and changed the GC settings by
> > specifying  -XX:+UseParNewGC  and  -XX:+UseConcMarkSweepGC JVM flags.
> > Even after doing this, I am running into all kinds of IOExceptions
> > and SocketTimeoutExceptions.
> >
> > This Java batch opens approximately 7*2 (14) scanners open at a point in
> > time and still I am running into all kinds of troubles. I am wondering
> > whether I can have thousands of parallel scanners with HBase when I need
> to
> > scale.
> >
> > It would be great to know whether I can open thousands/millions of
> scanners
> > in parallel with HBase efficiently.
> >
> > Thanks
> > Narendra
>

Re: HBase parallel scanner performance

Posted by S Ahmed <sa...@gmail.com>.
great thread for a real world problem.

Michael, it sounds like the initial design was more of a traditional db
solution, whereas with hbase (and nosql in general) the design is to
denormalize and build your row/cf structure to fit the use case.  Disks are
cheap, writes are fast, so build your index in order to scan for the
results you need.

On Thu, Apr 19, 2012 at 2:33 PM, Michael Segel <mi...@hotmail.com>wrote:

> No problem.
>
> One of the hardest things to do is to try to be open to other design ideas
> and not become wedded to one.
>
> I think once you get that working you can start to look at your cluster.
>
> On Apr 19, 2012, at 1:26 PM, Narendra yadala wrote:
>
> > Michael,
> >
> > I will do the redesign and build the index. Thanks a lot for the
> insights.
> >
> > Narendra
> >
> > On Thu, Apr 19, 2012 at 9:56 PM, Michael Segel <
> michael_segel@hotmail.com>wrote:
> >
> >> Narendra,
> >>
> >> I think you are still missing the point.
> >> 130 seconds to scan the table per iteration.
> >> Even if you have 10K rows
> >> 130 * 10^4 or 1.3*10^6 seconds.  ~361 hours
> >>
> >> Compare that to 10K rows where you then select a single row in your sub
> >> select that has a list of all of the associated rows.
> >> You can then do  n number of get()s based on the data in the index. (If
> >> the data wasn't in the index itself)
> >>
> >> Assuming that the data was in the index, that's one get(). This is sub
> >> second.
> >> Just to keep things simple assume 1 second.
> >> That's 10K seconds vs 1.3 million seconds.  (2 hours vs 361hours)
> >> Actually its more like 10ms  so its 100 seconds to run your code.  (So
> its
> >> like 2 minutes or so)
> >>
> >> Also since you're doing less work, you put less strain on the system.
> >>
> >> Look, you're asking for help. You're fighting to maintain a bad design.
> >> Building the index table shouldn't take you more than a day to think,
> >> design and implement.
> >>
> >> So you tell me, 2 minutes vs 361 hours. Which would you choose?
> >>
> >> HTH
> >>
> >> -Mike
> >>
> >>
> >> On Apr 19, 2012, at 10:04 AM, Narendra yadala wrote:
> >>
> >>> Michael,
> >>>
> >>> Thanks for the response. This is a real problem and not a class
> project.
> >>> Boxes itself costed 9k ;)
> >>>
> >>> I think there is some difference in understanding of the problem. The
> >> table
> >>> has 2m rows but I am looking at the latest 10k rows only in the outer
> for
> >>> loop. Only in the inner for loop i am trying to get all rows that
> contain
> >>> the url that is given by the row in the outer for loop. So pseudo code
> is
> >>> like this
> >>>
> >>> All scanners have a caching of 128.
> >>>
> >>> Scanner outerScanner =  tweetTable.getScanner(new Scan()); //This gets
> >> the
> >>> entire row
> >>> for (int index = 0; index < 10000; index++) {
> >>> Result tweet =  outerScanner.next();
> >>> NavigableMap<byte[],byte[]> linkFamilyMap =
> >>> tweet.getFamilyMap(Bytes.toBytes("link"));
> >>> String url = Bytes.toString( linkFamilyMap.firstKey());  //assuming
> only
> >>> one link is there in the tweet.
> >>> Scan linkScan = new Scan();
> >>> linkScan.addColumn(Bytes.toBytes("link"), Bytes.toBytes(url)); //get
> only
> >>> the link column family
> >>> Scanner linkScanner = tweetTable.getScanner(linkScan); //ideally this
> for
> >>> loop is taking 2 sec per sc
> >>> for (Result linkResult = linkScanner.next(); linkResult != null;
> >>> linkResult = linkScanner.next()) {
> >>>   //do something with the link
> >>> }
> >>> linkScanner.close();
> >>>
> >>>       //do a similar for loop for hashtags
> >>> }
> >>>
> >>> Each of my inner for loop is taking around 20 seconds (or more
> depending
> >> on
> >>> number of rows returned by that particular scanner) for each of the 10k
> >>> rows that I am processing and this is also triggering a lot of GC in
> >> turn.
> >>> So it is 10000*40 seconds (4 days) for each thread. But the problem is
> >> that
> >>> the batch process crashes before completion throwing IOException and
> >>> SocketTimeoutException and sometimes GC OutOfMemory exceptions.
> >>>
> >>> I will definitely take the much elegant approach that you mentioned
> >>> eventually. I just wanted to get to the core of the issue before
> choosing
> >>> the solution.
> >>>
> >>> Thanks again.
> >>> Narendra
> >>>
> >>> On Thu, Apr 19, 2012 at 7:42 PM, Michel Segel <
> michael_segel@hotmail.com
> >>> wrote:
> >>>
> >>>> Narendra,
> >>>>
> >>>> Are you trying to solve a real problem, or is this a class project?
> >>>>
> >>>> Your solution doesn't scale. It's a non starter. 130 seconds for each
> >>>> iteration times 1 million seconds is how long? 130 million seconds,
> >> which
> >>>> is ~36000 hours or over 4 years to complete.
> >>>> (the numbers are rough but you get the idea...)
> >>>>
> >>>> That's assuming that your table is static and doesn't change.
> >>>>
> >>>> I didn't even ask if you were attempting any sort of server side
> >> filtering
> >>>> which would reduce the amount of data you send back to the client
> >> because
> >>>> it a moot point.
> >>>>
> >>>> Finer tuning is also moot.
> >>>>
> >>>> So you insert a row in one table. You then do n^2 operations to pull
> out
> >>>> data.
> >>>> The better solution is to insert data into 2 tables where you then
> have
> >> to
> >>>> do 2n operations to get the same results. Thats per thread btw.  So if
> >> you
> >>>> were running 10 threads, you would have 10n^2  operations versus 20n
> >>>> operations to get the same result set.
> >>>>
> >>>> A million row table... 1*10^13. Vs 2*10^6
> >>>>
> >>>> I don't believe I mentioned anything about HBase's internals and this
> >>>> solution works for any NoSQL database.
> >>>>
> >>>>
> >>>> Sent from a remote device. Please excuse any typos...
> >>>>
> >>>> Mike Segel
> >>>>
> >>>> On Apr 19, 2012, at 7:03 AM, Narendra yadala <
> narendra.yadala@gmail.com
> >>>
> >>>> wrote:
> >>>>
> >>>>> Hi Michel
> >>>>>
> >>>>> Yes, that is exactly what I do in step 2. I am aware of the reason
> for
> >>>> the
> >>>>> scanner timeout exceptions. It is the time between two consecutive
> >>>>> invocations of the next call on a specific scanner object. I
> increased
> >>>> the
> >>>>> scanner timeout to 10 min on the region server and still I keep
> seeing
> >>>> the
> >>>>> timeouts. So I reduced my scanner cache to 128.
> >>>>>
> >>>>> Full table scan takes 130 seconds and there are 2.2 million rows in
> the
> >>>>> table as of now. Each row is around 2 KB in size. I measured time for
> >> the
> >>>>> full table scan by issuing `count` command from the hbase shell.
> >>>>>
> >>>>> I kind of understood the fix that you are specifying, but do I need
> to
> >>>>> change the table structure to fix this problem? All I do is a n^2
> >>>> operation
> >>>>> and even that fails with 10 different types of exceptions. It is
> mildly
> >>>>> annoying that I need to know all the low level storage details of
> HBase
> >>>> to
> >>>>> do such a simple operation. And this is happening for just 14
> parallel
> >>>>> scanners. I am wondering what would happen when there are thousands
> of
> >>>>> parallel scanners.
> >>>>>
> >>>>> Please let me know if there is any configuration param change which
> >> would
> >>>>> fix this issue.
> >>>>>
> >>>>> Thanks a lot
> >>>>> Narendra
> >>>>>
> >>>>> On Thu, Apr 19, 2012 at 4:40 PM, Michel Segel <
> >> michael_segel@hotmail.com
> >>>>> wrote:
> >>>>>
> >>>>>> So in your step 2 you have the following:
> >>>>>> FOREACH row IN TABLE alpha:
> >>>>>>  SELECT something
> >>>>>>  FROM TABLE alpha
> >>>>>>  WHERE alpha.url = row.url
> >>>>>>
> >>>>>> Right?
> >>>>>> And you are wondering why you are getting timeouts?
> >>>>>> ...
> >>>>>> ...
> >>>>>> And how long does it take to do a full table scan? ;-)
> >>>>>> (there's more, but that's the first thing you should see...)
> >>>>>>
> >>>>>> Try creating a second table where you invert the URL and key pair
> such
> >>>>>> that for each URL, you have a set of your alpha table's keys?
> >>>>>>
> >>>>>> Then you have the following...
> >>>>>> FOREACH row IN TABLE alpha:
> >>>>>> FETCH key-set FROM beta
> >>>>>> WHERE beta.rowkey = alpha.url
> >>>>>>
> >>>>>> Note I use FETCH to signify that you should get a single row in
> >>>> response.
> >>>>>>
> >>>>>> Does this make sense?
> >>>>>> ( your second table is actually and index of the URL column in your
> >>>> first
> >>>>>> table)
> >>>>>>
> >>>>>> HTH
> >>>>>>
> >>>>>> Sent from a remote device. Please excuse any typos...
> >>>>>>
> >>>>>> Mike Segel
> >>>>>>
> >>>>>> On Apr 19, 2012, at 5:43 AM, Narendra yadala <
> >> narendra.yadala@gmail.com
> >>>>>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> I have an issue with my HBase cluster. We have a 4 node
> HBase/Hadoop
> >>>>>> (4*32
> >>>>>>> GB RAM and 4*6 TB disk space) cluster. We are using Cloudera
> >>>> distribution
> >>>>>>> for maintaining our cluster. I have a single tweets table in which
> we
> >>>>>> store
> >>>>>>> the tweets, one tweet per row (it has millions of rows currently).
> >>>>>>>
> >>>>>>> Now I try to run a Java batch (not a map reduce) which does the
> >>>>>> following :
> >>>>>>>
> >>>>>>> 1. Open a scanner over the tweet table and read the tweets one
> after
> >>>>>>> another. I set scanner caching to 128 rows as higher scanner
> caching
> >>>> is
> >>>>>>> leading to ScannerTimeoutExceptions. I scan over the first 10k rows
> >>>>>> only.
> >>>>>>> 2. For each tweet, extract URLs (linkcolfamily:urlvalue) that are
> >>>> there
> >>>>>>> in that tweet and open another scanner over the tweets table to see
> >>>> who
> >>>>>>> else shared that link. This involves getting rows having that URL
> >> from
> >>>>>> the
> >>>>>>> entire table (not first 10k rows).
> >>>>>>> 3. Do similar stuff as in step 2 for hashtags
> >>>>>>> (hashtagcolfamily:hashtagvalue).
> >>>>>>> 4. Do steps 1-3 in parallel for approximately 7-8 threads. This
> >> number
> >>>>>>> can be higher (thousands also) later.
> >>>>>>>
> >>>>>>>
> >>>>>>> When I run this batch I got the GC issue which is specified here
> >>>>>>>
> >>>>>>
> >>>>
> >>
> http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/
> >>>>>>> Then I tried to turn on the MSLAB feature and changed the GC
> settings
> >>>> by
> >>>>>>> specifying  -XX:+UseParNewGC  and  -XX:+UseConcMarkSweepGC JVM
> flags.
> >>>>>>> Even after doing this, I am running into all kinds of IOExceptions
> >>>>>>> and SocketTimeoutExceptions.
> >>>>>>>
> >>>>>>> This Java batch opens approximately 7*2 (14) scanners open at a
> point
> >>>> in
> >>>>>>> time and still I am running into all kinds of troubles. I am
> >> wondering
> >>>>>>> whether I can have thousands of parallel scanners with HBase when I
> >>>> need
> >>>>>> to
> >>>>>>> scale.
> >>>>>>>
> >>>>>>> It would be great to know whether I can open thousands/millions of
> >>>>>> scanners
> >>>>>>> in parallel with HBase efficiently.
> >>>>>>>
> >>>>>>> Thanks
> >>>>>>> Narendra
> >>>>>>
> >>>>
> >>
> >>
>
>

Re: HBase parallel scanner performance

Posted by Michael Segel <mi...@hotmail.com>.
No problem. 

One of the hardest things to do is to try to be open to other design ideas and not become wedded to one.

I think once you get that working you can start to look at your cluster. 

On Apr 19, 2012, at 1:26 PM, Narendra yadala wrote:

> Michael,
> 
> I will do the redesign and build the index. Thanks a lot for the insights.
> 
> Narendra
> 
> On Thu, Apr 19, 2012 at 9:56 PM, Michael Segel <mi...@hotmail.com>wrote:
> 
>> Narendra,
>> 
>> I think you are still missing the point.
>> 130 seconds to scan the table per iteration.
>> Even if you have 10K rows
>> 130 * 10^4 or 1.3*10^6 seconds.  ~361 hours
>> 
>> Compare that to 10K rows where you then select a single row in your sub
>> select that has a list of all of the associated rows.
>> You can then do  n number of get()s based on the data in the index. (If
>> the data wasn't in the index itself)
>> 
>> Assuming that the data was in the index, that's one get(). This is sub
>> second.
>> Just to keep things simple assume 1 second.
>> That's 10K seconds vs 1.3 million seconds.  (2 hours vs 361hours)
>> Actually its more like 10ms  so its 100 seconds to run your code.  (So its
>> like 2 minutes or so)
>> 
>> Also since you're doing less work, you put less strain on the system.
>> 
>> Look, you're asking for help. You're fighting to maintain a bad design.
>> Building the index table shouldn't take you more than a day to think,
>> design and implement.
>> 
>> So you tell me, 2 minutes vs 361 hours. Which would you choose?
>> 
>> HTH
>> 
>> -Mike
>> 
>> 
>> On Apr 19, 2012, at 10:04 AM, Narendra yadala wrote:
>> 
>>> Michael,
>>> 
>>> Thanks for the response. This is a real problem and not a class project.
>>> Boxes itself costed 9k ;)
>>> 
>>> I think there is some difference in understanding of the problem. The
>> table
>>> has 2m rows but I am looking at the latest 10k rows only in the outer for
>>> loop. Only in the inner for loop i am trying to get all rows that contain
>>> the url that is given by the row in the outer for loop. So pseudo code is
>>> like this
>>> 
>>> All scanners have a caching of 128.
>>> 
>>> Scanner outerScanner =  tweetTable.getScanner(new Scan()); //This gets
>> the
>>> entire row
>>> for (int index = 0; index < 10000; index++) {
>>> Result tweet =  outerScanner.next();
>>> NavigableMap<byte[],byte[]> linkFamilyMap =
>>> tweet.getFamilyMap(Bytes.toBytes("link"));
>>> String url = Bytes.toString( linkFamilyMap.firstKey());  //assuming only
>>> one link is there in the tweet.
>>> Scan linkScan = new Scan();
>>> linkScan.addColumn(Bytes.toBytes("link"), Bytes.toBytes(url)); //get only
>>> the link column family
>>> Scanner linkScanner = tweetTable.getScanner(linkScan); //ideally this for
>>> loop is taking 2 sec per sc
>>> for (Result linkResult = linkScanner.next(); linkResult != null;
>>> linkResult = linkScanner.next()) {
>>>   //do something with the link
>>> }
>>> linkScanner.close();
>>> 
>>>       //do a similar for loop for hashtags
>>> }
>>> 
>>> Each of my inner for loop is taking around 20 seconds (or more depending
>> on
>>> number of rows returned by that particular scanner) for each of the 10k
>>> rows that I am processing and this is also triggering a lot of GC in
>> turn.
>>> So it is 10000*40 seconds (4 days) for each thread. But the problem is
>> that
>>> the batch process crashes before completion throwing IOException and
>>> SocketTimeoutException and sometimes GC OutOfMemory exceptions.
>>> 
>>> I will definitely take the much elegant approach that you mentioned
>>> eventually. I just wanted to get to the core of the issue before choosing
>>> the solution.
>>> 
>>> Thanks again.
>>> Narendra
>>> 
>>> On Thu, Apr 19, 2012 at 7:42 PM, Michel Segel <michael_segel@hotmail.com
>>> wrote:
>>> 
>>>> Narendra,
>>>> 
>>>> Are you trying to solve a real problem, or is this a class project?
>>>> 
>>>> Your solution doesn't scale. It's a non starter. 130 seconds for each
>>>> iteration times 1 million seconds is how long? 130 million seconds,
>> which
>>>> is ~36000 hours or over 4 years to complete.
>>>> (the numbers are rough but you get the idea...)
>>>> 
>>>> That's assuming that your table is static and doesn't change.
>>>> 
>>>> I didn't even ask if you were attempting any sort of server side
>> filtering
>>>> which would reduce the amount of data you send back to the client
>> because
>>>> it a moot point.
>>>> 
>>>> Finer tuning is also moot.
>>>> 
>>>> So you insert a row in one table. You then do n^2 operations to pull out
>>>> data.
>>>> The better solution is to insert data into 2 tables where you then have
>> to
>>>> do 2n operations to get the same results. Thats per thread btw.  So if
>> you
>>>> were running 10 threads, you would have 10n^2  operations versus 20n
>>>> operations to get the same result set.
>>>> 
>>>> A million row table... 1*10^13. Vs 2*10^6
>>>> 
>>>> I don't believe I mentioned anything about HBase's internals and this
>>>> solution works for any NoSQL database.
>>>> 
>>>> 
>>>> Sent from a remote device. Please excuse any typos...
>>>> 
>>>> Mike Segel
>>>> 
>>>> On Apr 19, 2012, at 7:03 AM, Narendra yadala <narendra.yadala@gmail.com
>>> 
>>>> wrote:
>>>> 
>>>>> Hi Michel
>>>>> 
>>>>> Yes, that is exactly what I do in step 2. I am aware of the reason for
>>>> the
>>>>> scanner timeout exceptions. It is the time between two consecutive
>>>>> invocations of the next call on a specific scanner object. I increased
>>>> the
>>>>> scanner timeout to 10 min on the region server and still I keep seeing
>>>> the
>>>>> timeouts. So I reduced my scanner cache to 128.
>>>>> 
>>>>> Full table scan takes 130 seconds and there are 2.2 million rows in the
>>>>> table as of now. Each row is around 2 KB in size. I measured time for
>> the
>>>>> full table scan by issuing `count` command from the hbase shell.
>>>>> 
>>>>> I kind of understood the fix that you are specifying, but do I need to
>>>>> change the table structure to fix this problem? All I do is a n^2
>>>> operation
>>>>> and even that fails with 10 different types of exceptions. It is mildly
>>>>> annoying that I need to know all the low level storage details of HBase
>>>> to
>>>>> do such a simple operation. And this is happening for just 14 parallel
>>>>> scanners. I am wondering what would happen when there are thousands of
>>>>> parallel scanners.
>>>>> 
>>>>> Please let me know if there is any configuration param change which
>> would
>>>>> fix this issue.
>>>>> 
>>>>> Thanks a lot
>>>>> Narendra
>>>>> 
>>>>> On Thu, Apr 19, 2012 at 4:40 PM, Michel Segel <
>> michael_segel@hotmail.com
>>>>> wrote:
>>>>> 
>>>>>> So in your step 2 you have the following:
>>>>>> FOREACH row IN TABLE alpha:
>>>>>>  SELECT something
>>>>>>  FROM TABLE alpha
>>>>>>  WHERE alpha.url = row.url
>>>>>> 
>>>>>> Right?
>>>>>> And you are wondering why you are getting timeouts?
>>>>>> ...
>>>>>> ...
>>>>>> And how long does it take to do a full table scan? ;-)
>>>>>> (there's more, but that's the first thing you should see...)
>>>>>> 
>>>>>> Try creating a second table where you invert the URL and key pair such
>>>>>> that for each URL, you have a set of your alpha table's keys?
>>>>>> 
>>>>>> Then you have the following...
>>>>>> FOREACH row IN TABLE alpha:
>>>>>> FETCH key-set FROM beta
>>>>>> WHERE beta.rowkey = alpha.url
>>>>>> 
>>>>>> Note I use FETCH to signify that you should get a single row in
>>>> response.
>>>>>> 
>>>>>> Does this make sense?
>>>>>> ( your second table is actually and index of the URL column in your
>>>> first
>>>>>> table)
>>>>>> 
>>>>>> HTH
>>>>>> 
>>>>>> Sent from a remote device. Please excuse any typos...
>>>>>> 
>>>>>> Mike Segel
>>>>>> 
>>>>>> On Apr 19, 2012, at 5:43 AM, Narendra yadala <
>> narendra.yadala@gmail.com
>>>>> 
>>>>>> wrote:
>>>>>> 
>>>>>>> I have an issue with my HBase cluster. We have a 4 node HBase/Hadoop
>>>>>> (4*32
>>>>>>> GB RAM and 4*6 TB disk space) cluster. We are using Cloudera
>>>> distribution
>>>>>>> for maintaining our cluster. I have a single tweets table in which we
>>>>>> store
>>>>>>> the tweets, one tweet per row (it has millions of rows currently).
>>>>>>> 
>>>>>>> Now I try to run a Java batch (not a map reduce) which does the
>>>>>> following :
>>>>>>> 
>>>>>>> 1. Open a scanner over the tweet table and read the tweets one after
>>>>>>> another. I set scanner caching to 128 rows as higher scanner caching
>>>> is
>>>>>>> leading to ScannerTimeoutExceptions. I scan over the first 10k rows
>>>>>> only.
>>>>>>> 2. For each tweet, extract URLs (linkcolfamily:urlvalue) that are
>>>> there
>>>>>>> in that tweet and open another scanner over the tweets table to see
>>>> who
>>>>>>> else shared that link. This involves getting rows having that URL
>> from
>>>>>> the
>>>>>>> entire table (not first 10k rows).
>>>>>>> 3. Do similar stuff as in step 2 for hashtags
>>>>>>> (hashtagcolfamily:hashtagvalue).
>>>>>>> 4. Do steps 1-3 in parallel for approximately 7-8 threads. This
>> number
>>>>>>> can be higher (thousands also) later.
>>>>>>> 
>>>>>>> 
>>>>>>> When I run this batch I got the GC issue which is specified here
>>>>>>> 
>>>>>> 
>>>> 
>> http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/
>>>>>>> Then I tried to turn on the MSLAB feature and changed the GC settings
>>>> by
>>>>>>> specifying  -XX:+UseParNewGC  and  -XX:+UseConcMarkSweepGC JVM flags.
>>>>>>> Even after doing this, I am running into all kinds of IOExceptions
>>>>>>> and SocketTimeoutExceptions.
>>>>>>> 
>>>>>>> This Java batch opens approximately 7*2 (14) scanners open at a point
>>>> in
>>>>>>> time and still I am running into all kinds of troubles. I am
>> wondering
>>>>>>> whether I can have thousands of parallel scanners with HBase when I
>>>> need
>>>>>> to
>>>>>>> scale.
>>>>>>> 
>>>>>>> It would be great to know whether I can open thousands/millions of
>>>>>> scanners
>>>>>>> in parallel with HBase efficiently.
>>>>>>> 
>>>>>>> Thanks
>>>>>>> Narendra
>>>>>> 
>>>> 
>> 
>> 


Re: HBase parallel scanner performance

Posted by Narendra yadala <na...@gmail.com>.
Michael,

I will do the redesign and build the index. Thanks a lot for the insights.

Narendra

On Thu, Apr 19, 2012 at 9:56 PM, Michael Segel <mi...@hotmail.com>wrote:

> Narendra,
>
> I think you are still missing the point.
> 130 seconds to scan the table per iteration.
> Even if you have 10K rows
> 130 * 10^4 or 1.3*10^6 seconds.  ~361 hours
>
> Compare that to 10K rows where you then select a single row in your sub
> select that has a list of all of the associated rows.
> You can then do  n number of get()s based on the data in the index. (If
> the data wasn't in the index itself)
>
> Assuming that the data was in the index, that's one get(). This is sub
> second.
> Just to keep things simple assume 1 second.
> That's 10K seconds vs 1.3 million seconds.  (2 hours vs 361hours)
> Actually its more like 10ms  so its 100 seconds to run your code.  (So its
> like 2 minutes or so)
>
> Also since you're doing less work, you put less strain on the system.
>
> Look, you're asking for help. You're fighting to maintain a bad design.
> Building the index table shouldn't take you more than a day to think,
> design and implement.
>
> So you tell me, 2 minutes vs 361 hours. Which would you choose?
>
> HTH
>
> -Mike
>
>
> On Apr 19, 2012, at 10:04 AM, Narendra yadala wrote:
>
> > Michael,
> >
> > Thanks for the response. This is a real problem and not a class project.
> > Boxes itself costed 9k ;)
> >
> > I think there is some difference in understanding of the problem. The
> table
> > has 2m rows but I am looking at the latest 10k rows only in the outer for
> > loop. Only in the inner for loop i am trying to get all rows that contain
> > the url that is given by the row in the outer for loop. So pseudo code is
> > like this
> >
> > All scanners have a caching of 128.
> >
> > Scanner outerScanner =  tweetTable.getScanner(new Scan()); //This gets
> the
> > entire row
> > for (int index = 0; index < 10000; index++) {
> > Result tweet =  outerScanner.next();
> > NavigableMap<byte[],byte[]> linkFamilyMap =
> > tweet.getFamilyMap(Bytes.toBytes("link"));
> > String url = Bytes.toString( linkFamilyMap.firstKey());  //assuming only
> > one link is there in the tweet.
> > Scan linkScan = new Scan();
> > linkScan.addColumn(Bytes.toBytes("link"), Bytes.toBytes(url)); //get only
> > the link column family
> > Scanner linkScanner = tweetTable.getScanner(linkScan); //ideally this for
> > loop is taking 2 sec per sc
> > for (Result linkResult = linkScanner.next(); linkResult != null;
> > linkResult = linkScanner.next()) {
> >    //do something with the link
> > }
> > linkScanner.close();
> >
> >        //do a similar for loop for hashtags
> > }
> >
> > Each of my inner for loop is taking around 20 seconds (or more depending
> on
> > number of rows returned by that particular scanner) for each of the 10k
> > rows that I am processing and this is also triggering a lot of GC in
> turn.
> > So it is 10000*40 seconds (4 days) for each thread. But the problem is
> that
> > the batch process crashes before completion throwing IOException and
> > SocketTimeoutException and sometimes GC OutOfMemory exceptions.
> >
> > I will definitely take the much elegant approach that you mentioned
> > eventually. I just wanted to get to the core of the issue before choosing
> > the solution.
> >
> > Thanks again.
> > Narendra
> >
> > On Thu, Apr 19, 2012 at 7:42 PM, Michel Segel <michael_segel@hotmail.com
> >wrote:
> >
> >> Narendra,
> >>
> >> Are you trying to solve a real problem, or is this a class project?
> >>
> >> Your solution doesn't scale. It's a non starter. 130 seconds for each
> >> iteration times 1 million seconds is how long? 130 million seconds,
> which
> >> is ~36000 hours or over 4 years to complete.
> >> (the numbers are rough but you get the idea...)
> >>
> >> That's assuming that your table is static and doesn't change.
> >>
> >> I didn't even ask if you were attempting any sort of server side
> filtering
> >> which would reduce the amount of data you send back to the client
> because
> >> it a moot point.
> >>
> >> Finer tuning is also moot.
> >>
> >> So you insert a row in one table. You then do n^2 operations to pull out
> >> data.
> >> The better solution is to insert data into 2 tables where you then have
> to
> >> do 2n operations to get the same results. Thats per thread btw.  So if
> you
> >> were running 10 threads, you would have 10n^2  operations versus 20n
> >> operations to get the same result set.
> >>
> >> A million row table... 1*10^13. Vs 2*10^6
> >>
> >> I don't believe I mentioned anything about HBase's internals and this
> >> solution works for any NoSQL database.
> >>
> >>
> >> Sent from a remote device. Please excuse any typos...
> >>
> >> Mike Segel
> >>
> >> On Apr 19, 2012, at 7:03 AM, Narendra yadala <narendra.yadala@gmail.com
> >
> >> wrote:
> >>
> >>> Hi Michel
> >>>
> >>> Yes, that is exactly what I do in step 2. I am aware of the reason for
> >> the
> >>> scanner timeout exceptions. It is the time between two consecutive
> >>> invocations of the next call on a specific scanner object. I increased
> >> the
> >>> scanner timeout to 10 min on the region server and still I keep seeing
> >> the
> >>> timeouts. So I reduced my scanner cache to 128.
> >>>
> >>> Full table scan takes 130 seconds and there are 2.2 million rows in the
> >>> table as of now. Each row is around 2 KB in size. I measured time for
> the
> >>> full table scan by issuing `count` command from the hbase shell.
> >>>
> >>> I kind of understood the fix that you are specifying, but do I need to
> >>> change the table structure to fix this problem? All I do is a n^2
> >> operation
> >>> and even that fails with 10 different types of exceptions. It is mildly
> >>> annoying that I need to know all the low level storage details of HBase
> >> to
> >>> do such a simple operation. And this is happening for just 14 parallel
> >>> scanners. I am wondering what would happen when there are thousands of
> >>> parallel scanners.
> >>>
> >>> Please let me know if there is any configuration param change which
> would
> >>> fix this issue.
> >>>
> >>> Thanks a lot
> >>> Narendra
> >>>
> >>> On Thu, Apr 19, 2012 at 4:40 PM, Michel Segel <
> michael_segel@hotmail.com
> >>> wrote:
> >>>
> >>>> So in your step 2 you have the following:
> >>>> FOREACH row IN TABLE alpha:
> >>>>   SELECT something
> >>>>   FROM TABLE alpha
> >>>>   WHERE alpha.url = row.url
> >>>>
> >>>> Right?
> >>>> And you are wondering why you are getting timeouts?
> >>>> ...
> >>>> ...
> >>>> And how long does it take to do a full table scan? ;-)
> >>>> (there's more, but that's the first thing you should see...)
> >>>>
> >>>> Try creating a second table where you invert the URL and key pair such
> >>>> that for each URL, you have a set of your alpha table's keys?
> >>>>
> >>>> Then you have the following...
> >>>> FOREACH row IN TABLE alpha:
> >>>> FETCH key-set FROM beta
> >>>> WHERE beta.rowkey = alpha.url
> >>>>
> >>>> Note I use FETCH to signify that you should get a single row in
> >> response.
> >>>>
> >>>> Does this make sense?
> >>>> ( your second table is actually and index of the URL column in your
> >> first
> >>>> table)
> >>>>
> >>>> HTH
> >>>>
> >>>> Sent from a remote device. Please excuse any typos...
> >>>>
> >>>> Mike Segel
> >>>>
> >>>> On Apr 19, 2012, at 5:43 AM, Narendra yadala <
> narendra.yadala@gmail.com
> >>>
> >>>> wrote:
> >>>>
> >>>>> I have an issue with my HBase cluster. We have a 4 node HBase/Hadoop
> >>>> (4*32
> >>>>> GB RAM and 4*6 TB disk space) cluster. We are using Cloudera
> >> distribution
> >>>>> for maintaining our cluster. I have a single tweets table in which we
> >>>> store
> >>>>> the tweets, one tweet per row (it has millions of rows currently).
> >>>>>
> >>>>> Now I try to run a Java batch (not a map reduce) which does the
> >>>> following :
> >>>>>
> >>>>> 1. Open a scanner over the tweet table and read the tweets one after
> >>>>> another. I set scanner caching to 128 rows as higher scanner caching
> >> is
> >>>>> leading to ScannerTimeoutExceptions. I scan over the first 10k rows
> >>>> only.
> >>>>> 2. For each tweet, extract URLs (linkcolfamily:urlvalue) that are
> >> there
> >>>>> in that tweet and open another scanner over the tweets table to see
> >> who
> >>>>> else shared that link. This involves getting rows having that URL
> from
> >>>> the
> >>>>> entire table (not first 10k rows).
> >>>>> 3. Do similar stuff as in step 2 for hashtags
> >>>>> (hashtagcolfamily:hashtagvalue).
> >>>>> 4. Do steps 1-3 in parallel for approximately 7-8 threads. This
> number
> >>>>> can be higher (thousands also) later.
> >>>>>
> >>>>>
> >>>>> When I run this batch I got the GC issue which is specified here
> >>>>>
> >>>>
> >>
> http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/
> >>>>> Then I tried to turn on the MSLAB feature and changed the GC settings
> >> by
> >>>>> specifying  -XX:+UseParNewGC  and  -XX:+UseConcMarkSweepGC JVM flags.
> >>>>> Even after doing this, I am running into all kinds of IOExceptions
> >>>>> and SocketTimeoutExceptions.
> >>>>>
> >>>>> This Java batch opens approximately 7*2 (14) scanners open at a point
> >> in
> >>>>> time and still I am running into all kinds of troubles. I am
> wondering
> >>>>> whether I can have thousands of parallel scanners with HBase when I
> >> need
> >>>> to
> >>>>> scale.
> >>>>>
> >>>>> It would be great to know whether I can open thousands/millions of
> >>>> scanners
> >>>>> in parallel with HBase efficiently.
> >>>>>
> >>>>> Thanks
> >>>>> Narendra
> >>>>
> >>
>
>

Re: HBase parallel scanner performance

Posted by Michael Segel <mi...@hotmail.com>.
Narendra, 

I think you are still missing the point. 
130 seconds to scan the table per iteration. 
Even if you have 10K rows 
130 * 10^4 or 1.3*10^6 seconds.  ~361 hours

Compare that to 10K rows where you then select a single row in your sub select that has a list of all of the associated rows. 
You can then do  n number of get()s based on the data in the index. (If the data wasn't in the index itself)

Assuming that the data was in the index, that's one get(). This is sub second. 
Just to keep things simple assume 1 second. 
That's 10K seconds vs 1.3 million seconds.  (2 hours vs 361hours) 
Actually its more like 10ms  so its 100 seconds to run your code.  (So its like 2 minutes or so) 

Also since you're doing less work, you put less strain on the system.

Look, you're asking for help. You're fighting to maintain a bad design. 
Building the index table shouldn't take you more than a day to think, design and implement. 

So you tell me, 2 minutes vs 361 hours. Which would you choose?

HTH

-Mike


On Apr 19, 2012, at 10:04 AM, Narendra yadala wrote:

> Michael,
> 
> Thanks for the response. This is a real problem and not a class project.
> Boxes itself costed 9k ;)
> 
> I think there is some difference in understanding of the problem. The table
> has 2m rows but I am looking at the latest 10k rows only in the outer for
> loop. Only in the inner for loop i am trying to get all rows that contain
> the url that is given by the row in the outer for loop. So pseudo code is
> like this
> 
> All scanners have a caching of 128.
> 
> Scanner outerScanner =  tweetTable.getScanner(new Scan()); //This gets the
> entire row
> for (int index = 0; index < 10000; index++) {
> Result tweet =  outerScanner.next();
> NavigableMap<byte[],byte[]> linkFamilyMap =
> tweet.getFamilyMap(Bytes.toBytes("link"));
> String url = Bytes.toString( linkFamilyMap.firstKey());  //assuming only
> one link is there in the tweet.
> Scan linkScan = new Scan();
> linkScan.addColumn(Bytes.toBytes("link"), Bytes.toBytes(url)); //get only
> the link column family
> Scanner linkScanner = tweetTable.getScanner(linkScan); //ideally this for
> loop is taking 2 sec per sc
> for (Result linkResult = linkScanner.next(); linkResult != null;
> linkResult = linkScanner.next()) {
>    //do something with the link
> }
> linkScanner.close();
> 
>        //do a similar for loop for hashtags
> }
> 
> Each of my inner for loop is taking around 20 seconds (or more depending on
> number of rows returned by that particular scanner) for each of the 10k
> rows that I am processing and this is also triggering a lot of GC in turn.
> So it is 10000*40 seconds (4 days) for each thread. But the problem is that
> the batch process crashes before completion throwing IOException and
> SocketTimeoutException and sometimes GC OutOfMemory exceptions.
> 
> I will definitely take the much elegant approach that you mentioned
> eventually. I just wanted to get to the core of the issue before choosing
> the solution.
> 
> Thanks again.
> Narendra
> 
> On Thu, Apr 19, 2012 at 7:42 PM, Michel Segel <mi...@hotmail.com>wrote:
> 
>> Narendra,
>> 
>> Are you trying to solve a real problem, or is this a class project?
>> 
>> Your solution doesn't scale. It's a non starter. 130 seconds for each
>> iteration times 1 million seconds is how long? 130 million seconds, which
>> is ~36000 hours or over 4 years to complete.
>> (the numbers are rough but you get the idea...)
>> 
>> That's assuming that your table is static and doesn't change.
>> 
>> I didn't even ask if you were attempting any sort of server side filtering
>> which would reduce the amount of data you send back to the client because
>> it a moot point.
>> 
>> Finer tuning is also moot.
>> 
>> So you insert a row in one table. You then do n^2 operations to pull out
>> data.
>> The better solution is to insert data into 2 tables where you then have to
>> do 2n operations to get the same results. Thats per thread btw.  So if you
>> were running 10 threads, you would have 10n^2  operations versus 20n
>> operations to get the same result set.
>> 
>> A million row table... 1*10^13. Vs 2*10^6
>> 
>> I don't believe I mentioned anything about HBase's internals and this
>> solution works for any NoSQL database.
>> 
>> 
>> Sent from a remote device. Please excuse any typos...
>> 
>> Mike Segel
>> 
>> On Apr 19, 2012, at 7:03 AM, Narendra yadala <na...@gmail.com>
>> wrote:
>> 
>>> Hi Michel
>>> 
>>> Yes, that is exactly what I do in step 2. I am aware of the reason for
>> the
>>> scanner timeout exceptions. It is the time between two consecutive
>>> invocations of the next call on a specific scanner object. I increased
>> the
>>> scanner timeout to 10 min on the region server and still I keep seeing
>> the
>>> timeouts. So I reduced my scanner cache to 128.
>>> 
>>> Full table scan takes 130 seconds and there are 2.2 million rows in the
>>> table as of now. Each row is around 2 KB in size. I measured time for the
>>> full table scan by issuing `count` command from the hbase shell.
>>> 
>>> I kind of understood the fix that you are specifying, but do I need to
>>> change the table structure to fix this problem? All I do is a n^2
>> operation
>>> and even that fails with 10 different types of exceptions. It is mildly
>>> annoying that I need to know all the low level storage details of HBase
>> to
>>> do such a simple operation. And this is happening for just 14 parallel
>>> scanners. I am wondering what would happen when there are thousands of
>>> parallel scanners.
>>> 
>>> Please let me know if there is any configuration param change which would
>>> fix this issue.
>>> 
>>> Thanks a lot
>>> Narendra
>>> 
>>> On Thu, Apr 19, 2012 at 4:40 PM, Michel Segel <michael_segel@hotmail.com
>>> wrote:
>>> 
>>>> So in your step 2 you have the following:
>>>> FOREACH row IN TABLE alpha:
>>>>   SELECT something
>>>>   FROM TABLE alpha
>>>>   WHERE alpha.url = row.url
>>>> 
>>>> Right?
>>>> And you are wondering why you are getting timeouts?
>>>> ...
>>>> ...
>>>> And how long does it take to do a full table scan? ;-)
>>>> (there's more, but that's the first thing you should see...)
>>>> 
>>>> Try creating a second table where you invert the URL and key pair such
>>>> that for each URL, you have a set of your alpha table's keys?
>>>> 
>>>> Then you have the following...
>>>> FOREACH row IN TABLE alpha:
>>>> FETCH key-set FROM beta
>>>> WHERE beta.rowkey = alpha.url
>>>> 
>>>> Note I use FETCH to signify that you should get a single row in
>> response.
>>>> 
>>>> Does this make sense?
>>>> ( your second table is actually and index of the URL column in your
>> first
>>>> table)
>>>> 
>>>> HTH
>>>> 
>>>> Sent from a remote device. Please excuse any typos...
>>>> 
>>>> Mike Segel
>>>> 
>>>> On Apr 19, 2012, at 5:43 AM, Narendra yadala <narendra.yadala@gmail.com
>>> 
>>>> wrote:
>>>> 
>>>>> I have an issue with my HBase cluster. We have a 4 node HBase/Hadoop
>>>> (4*32
>>>>> GB RAM and 4*6 TB disk space) cluster. We are using Cloudera
>> distribution
>>>>> for maintaining our cluster. I have a single tweets table in which we
>>>> store
>>>>> the tweets, one tweet per row (it has millions of rows currently).
>>>>> 
>>>>> Now I try to run a Java batch (not a map reduce) which does the
>>>> following :
>>>>> 
>>>>> 1. Open a scanner over the tweet table and read the tweets one after
>>>>> another. I set scanner caching to 128 rows as higher scanner caching
>> is
>>>>> leading to ScannerTimeoutExceptions. I scan over the first 10k rows
>>>> only.
>>>>> 2. For each tweet, extract URLs (linkcolfamily:urlvalue) that are
>> there
>>>>> in that tweet and open another scanner over the tweets table to see
>> who
>>>>> else shared that link. This involves getting rows having that URL from
>>>> the
>>>>> entire table (not first 10k rows).
>>>>> 3. Do similar stuff as in step 2 for hashtags
>>>>> (hashtagcolfamily:hashtagvalue).
>>>>> 4. Do steps 1-3 in parallel for approximately 7-8 threads. This number
>>>>> can be higher (thousands also) later.
>>>>> 
>>>>> 
>>>>> When I run this batch I got the GC issue which is specified here
>>>>> 
>>>> 
>> http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/
>>>>> Then I tried to turn on the MSLAB feature and changed the GC settings
>> by
>>>>> specifying  -XX:+UseParNewGC  and  -XX:+UseConcMarkSweepGC JVM flags.
>>>>> Even after doing this, I am running into all kinds of IOExceptions
>>>>> and SocketTimeoutExceptions.
>>>>> 
>>>>> This Java batch opens approximately 7*2 (14) scanners open at a point
>> in
>>>>> time and still I am running into all kinds of troubles. I am wondering
>>>>> whether I can have thousands of parallel scanners with HBase when I
>> need
>>>> to
>>>>> scale.
>>>>> 
>>>>> It would be great to know whether I can open thousands/millions of
>>>> scanners
>>>>> in parallel with HBase efficiently.
>>>>> 
>>>>> Thanks
>>>>> Narendra
>>>> 
>> 


Re: HBase parallel scanner performance

Posted by Narendra yadala <na...@gmail.com>.
Michael,

Thanks for the response. This is a real problem and not a class project.
Boxes itself costed 9k ;)

I think there is some difference in understanding of the problem. The table
has 2m rows but I am looking at the latest 10k rows only in the outer for
loop. Only in the inner for loop i am trying to get all rows that contain
the url that is given by the row in the outer for loop. So pseudo code is
like this

All scanners have a caching of 128.

Scanner outerScanner =  tweetTable.getScanner(new Scan()); //This gets the
entire row
for (int index = 0; index < 10000; index++) {
 Result tweet =  outerScanner.next();
NavigableMap<byte[],byte[]> linkFamilyMap =
 tweet.getFamilyMap(Bytes.toBytes("link"));
 String url = Bytes.toString( linkFamilyMap.firstKey());  //assuming only
one link is there in the tweet.
Scan linkScan = new Scan();
 linkScan.addColumn(Bytes.toBytes("link"), Bytes.toBytes(url)); //get only
the link column family
Scanner linkScanner = tweetTable.getScanner(linkScan); //ideally this for
loop is taking 2 sec per sc
 for (Result linkResult = linkScanner.next(); linkResult != null;
linkResult = linkScanner.next()) {
    //do something with the link
 }
linkScanner.close();

        //do a similar for loop for hashtags
}

Each of my inner for loop is taking around 20 seconds (or more depending on
number of rows returned by that particular scanner) for each of the 10k
rows that I am processing and this is also triggering a lot of GC in turn.
So it is 10000*40 seconds (4 days) for each thread. But the problem is that
the batch process crashes before completion throwing IOException and
SocketTimeoutException and sometimes GC OutOfMemory exceptions.

I will definitely take the much elegant approach that you mentioned
eventually. I just wanted to get to the core of the issue before choosing
the solution.

Thanks again.
Narendra

On Thu, Apr 19, 2012 at 7:42 PM, Michel Segel <mi...@hotmail.com>wrote:

> Narendra,
>
> Are you trying to solve a real problem, or is this a class project?
>
> Your solution doesn't scale. It's a non starter. 130 seconds for each
> iteration times 1 million seconds is how long? 130 million seconds, which
> is ~36000 hours or over 4 years to complete.
> (the numbers are rough but you get the idea...)
>
> That's assuming that your table is static and doesn't change.
>
> I didn't even ask if you were attempting any sort of server side filtering
> which would reduce the amount of data you send back to the client because
> it a moot point.
>
> Finer tuning is also moot.
>
> So you insert a row in one table. You then do n^2 operations to pull out
> data.
> The better solution is to insert data into 2 tables where you then have to
> do 2n operations to get the same results. Thats per thread btw.  So if you
> were running 10 threads, you would have 10n^2  operations versus 20n
> operations to get the same result set.
>
> A million row table... 1*10^13. Vs 2*10^6
>
> I don't believe I mentioned anything about HBase's internals and this
> solution works for any NoSQL database.
>
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Apr 19, 2012, at 7:03 AM, Narendra yadala <na...@gmail.com>
> wrote:
>
> > Hi Michel
> >
> > Yes, that is exactly what I do in step 2. I am aware of the reason for
> the
> > scanner timeout exceptions. It is the time between two consecutive
> > invocations of the next call on a specific scanner object. I increased
> the
> > scanner timeout to 10 min on the region server and still I keep seeing
> the
> > timeouts. So I reduced my scanner cache to 128.
> >
> > Full table scan takes 130 seconds and there are 2.2 million rows in the
> > table as of now. Each row is around 2 KB in size. I measured time for the
> > full table scan by issuing `count` command from the hbase shell.
> >
> > I kind of understood the fix that you are specifying, but do I need to
> > change the table structure to fix this problem? All I do is a n^2
> operation
> > and even that fails with 10 different types of exceptions. It is mildly
> > annoying that I need to know all the low level storage details of HBase
> to
> > do such a simple operation. And this is happening for just 14 parallel
> > scanners. I am wondering what would happen when there are thousands of
> > parallel scanners.
> >
> > Please let me know if there is any configuration param change which would
> > fix this issue.
> >
> > Thanks a lot
> > Narendra
> >
> > On Thu, Apr 19, 2012 at 4:40 PM, Michel Segel <michael_segel@hotmail.com
> >wrote:
> >
> >> So in your step 2 you have the following:
> >> FOREACH row IN TABLE alpha:
> >>    SELECT something
> >>    FROM TABLE alpha
> >>    WHERE alpha.url = row.url
> >>
> >> Right?
> >> And you are wondering why you are getting timeouts?
> >> ...
> >> ...
> >> And how long does it take to do a full table scan? ;-)
> >> (there's more, but that's the first thing you should see...)
> >>
> >> Try creating a second table where you invert the URL and key pair such
> >> that for each URL, you have a set of your alpha table's keys?
> >>
> >> Then you have the following...
> >> FOREACH row IN TABLE alpha:
> >>  FETCH key-set FROM beta
> >>  WHERE beta.rowkey = alpha.url
> >>
> >> Note I use FETCH to signify that you should get a single row in
> response.
> >>
> >> Does this make sense?
> >> ( your second table is actually and index of the URL column in your
> first
> >> table)
> >>
> >> HTH
> >>
> >> Sent from a remote device. Please excuse any typos...
> >>
> >> Mike Segel
> >>
> >> On Apr 19, 2012, at 5:43 AM, Narendra yadala <narendra.yadala@gmail.com
> >
> >> wrote:
> >>
> >>> I have an issue with my HBase cluster. We have a 4 node HBase/Hadoop
> >> (4*32
> >>> GB RAM and 4*6 TB disk space) cluster. We are using Cloudera
> distribution
> >>> for maintaining our cluster. I have a single tweets table in which we
> >> store
> >>> the tweets, one tweet per row (it has millions of rows currently).
> >>>
> >>> Now I try to run a Java batch (not a map reduce) which does the
> >> following :
> >>>
> >>>  1. Open a scanner over the tweet table and read the tweets one after
> >>>  another. I set scanner caching to 128 rows as higher scanner caching
> is
> >>>  leading to ScannerTimeoutExceptions. I scan over the first 10k rows
> >> only.
> >>>  2. For each tweet, extract URLs (linkcolfamily:urlvalue) that are
> there
> >>>  in that tweet and open another scanner over the tweets table to see
> who
> >>>  else shared that link. This involves getting rows having that URL from
> >> the
> >>>  entire table (not first 10k rows).
> >>>  3. Do similar stuff as in step 2 for hashtags
> >>>  (hashtagcolfamily:hashtagvalue).
> >>>  4. Do steps 1-3 in parallel for approximately 7-8 threads. This number
> >>>  can be higher (thousands also) later.
> >>>
> >>>
> >>> When I run this batch I got the GC issue which is specified here
> >>>
> >>
> http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/
> >>> Then I tried to turn on the MSLAB feature and changed the GC settings
> by
> >>> specifying  -XX:+UseParNewGC  and  -XX:+UseConcMarkSweepGC JVM flags.
> >>> Even after doing this, I am running into all kinds of IOExceptions
> >>> and SocketTimeoutExceptions.
> >>>
> >>> This Java batch opens approximately 7*2 (14) scanners open at a point
> in
> >>> time and still I am running into all kinds of troubles. I am wondering
> >>> whether I can have thousands of parallel scanners with HBase when I
> need
> >> to
> >>> scale.
> >>>
> >>> It would be great to know whether I can open thousands/millions of
> >> scanners
> >>> in parallel with HBase efficiently.
> >>>
> >>> Thanks
> >>> Narendra
> >>
>

Re: HBase parallel scanner performance

Posted by Michel Segel <mi...@hotmail.com>.
Narendra, 

Are you trying to solve a real problem, or is this a class project?

Your solution doesn't scale. It's a non starter. 130 seconds for each iteration times 1 million seconds is how long? 130 million seconds, which is ~36000 hours or over 4 years to complete.
(the numbers are rough but you get the idea...)

That's assuming that your table is static and doesn't change.

I didn't even ask if you were attempting any sort of server side filtering which would reduce the amount of data you send back to the client because it a moot point. 

Finer tuning is also moot.

So you insert a row in one table. You then do n^2 operations to pull out data.
The better solution is to insert data into 2 tables where you then have to do 2n operations to get the same results. Thats per thread btw.  So if you were running 10 threads, you would have 10n^2  operations versus 20n operations to get the same result set.

A million row table... 1*10^13. Vs 2*10^6 

I don't believe I mentioned anything about HBase's internals and this solution works for any NoSQL database.


Sent from a remote device. Please excuse any typos...

Mike Segel

On Apr 19, 2012, at 7:03 AM, Narendra yadala <na...@gmail.com> wrote:

> Hi Michel
> 
> Yes, that is exactly what I do in step 2. I am aware of the reason for the
> scanner timeout exceptions. It is the time between two consecutive
> invocations of the next call on a specific scanner object. I increased the
> scanner timeout to 10 min on the region server and still I keep seeing the
> timeouts. So I reduced my scanner cache to 128.
> 
> Full table scan takes 130 seconds and there are 2.2 million rows in the
> table as of now. Each row is around 2 KB in size. I measured time for the
> full table scan by issuing `count` command from the hbase shell.
> 
> I kind of understood the fix that you are specifying, but do I need to
> change the table structure to fix this problem? All I do is a n^2 operation
> and even that fails with 10 different types of exceptions. It is mildly
> annoying that I need to know all the low level storage details of HBase to
> do such a simple operation. And this is happening for just 14 parallel
> scanners. I am wondering what would happen when there are thousands of
> parallel scanners.
> 
> Please let me know if there is any configuration param change which would
> fix this issue.
> 
> Thanks a lot
> Narendra
> 
> On Thu, Apr 19, 2012 at 4:40 PM, Michel Segel <mi...@hotmail.com>wrote:
> 
>> So in your step 2 you have the following:
>> FOREACH row IN TABLE alpha:
>>    SELECT something
>>    FROM TABLE alpha
>>    WHERE alpha.url = row.url
>> 
>> Right?
>> And you are wondering why you are getting timeouts?
>> ...
>> ...
>> And how long does it take to do a full table scan? ;-)
>> (there's more, but that's the first thing you should see...)
>> 
>> Try creating a second table where you invert the URL and key pair such
>> that for each URL, you have a set of your alpha table's keys?
>> 
>> Then you have the following...
>> FOREACH row IN TABLE alpha:
>>  FETCH key-set FROM beta
>>  WHERE beta.rowkey = alpha.url
>> 
>> Note I use FETCH to signify that you should get a single row in response.
>> 
>> Does this make sense?
>> ( your second table is actually and index of the URL column in your first
>> table)
>> 
>> HTH
>> 
>> Sent from a remote device. Please excuse any typos...
>> 
>> Mike Segel
>> 
>> On Apr 19, 2012, at 5:43 AM, Narendra yadala <na...@gmail.com>
>> wrote:
>> 
>>> I have an issue with my HBase cluster. We have a 4 node HBase/Hadoop
>> (4*32
>>> GB RAM and 4*6 TB disk space) cluster. We are using Cloudera distribution
>>> for maintaining our cluster. I have a single tweets table in which we
>> store
>>> the tweets, one tweet per row (it has millions of rows currently).
>>> 
>>> Now I try to run a Java batch (not a map reduce) which does the
>> following :
>>> 
>>>  1. Open a scanner over the tweet table and read the tweets one after
>>>  another. I set scanner caching to 128 rows as higher scanner caching is
>>>  leading to ScannerTimeoutExceptions. I scan over the first 10k rows
>> only.
>>>  2. For each tweet, extract URLs (linkcolfamily:urlvalue) that are there
>>>  in that tweet and open another scanner over the tweets table to see who
>>>  else shared that link. This involves getting rows having that URL from
>> the
>>>  entire table (not first 10k rows).
>>>  3. Do similar stuff as in step 2 for hashtags
>>>  (hashtagcolfamily:hashtagvalue).
>>>  4. Do steps 1-3 in parallel for approximately 7-8 threads. This number
>>>  can be higher (thousands also) later.
>>> 
>>> 
>>> When I run this batch I got the GC issue which is specified here
>>> 
>> http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/
>>> Then I tried to turn on the MSLAB feature and changed the GC settings by
>>> specifying  -XX:+UseParNewGC  and  -XX:+UseConcMarkSweepGC JVM flags.
>>> Even after doing this, I am running into all kinds of IOExceptions
>>> and SocketTimeoutExceptions.
>>> 
>>> This Java batch opens approximately 7*2 (14) scanners open at a point in
>>> time and still I am running into all kinds of troubles. I am wondering
>>> whether I can have thousands of parallel scanners with HBase when I need
>> to
>>> scale.
>>> 
>>> It would be great to know whether I can open thousands/millions of
>> scanners
>>> in parallel with HBase efficiently.
>>> 
>>> Thanks
>>> Narendra
>> 

Re: HBase parallel scanner performance

Posted by Narendra yadala <na...@gmail.com>.
Hi Michel

Yes, that is exactly what I do in step 2. I am aware of the reason for the
scanner timeout exceptions. It is the time between two consecutive
invocations of the next call on a specific scanner object. I increased the
scanner timeout to 10 min on the region server and still I keep seeing the
timeouts. So I reduced my scanner cache to 128.

Full table scan takes 130 seconds and there are 2.2 million rows in the
table as of now. Each row is around 2 KB in size. I measured time for the
full table scan by issuing `count` command from the hbase shell.

I kind of understood the fix that you are specifying, but do I need to
change the table structure to fix this problem? All I do is a n^2 operation
and even that fails with 10 different types of exceptions. It is mildly
annoying that I need to know all the low level storage details of HBase to
do such a simple operation. And this is happening for just 14 parallel
scanners. I am wondering what would happen when there are thousands of
parallel scanners.

Please let me know if there is any configuration param change which would
fix this issue.

Thanks a lot
Narendra

On Thu, Apr 19, 2012 at 4:40 PM, Michel Segel <mi...@hotmail.com>wrote:

> So in your step 2 you have the following:
> FOREACH row IN TABLE alpha:
>     SELECT something
>     FROM TABLE alpha
>     WHERE alpha.url = row.url
>
> Right?
> And you are wondering why you are getting timeouts?
> ...
> ...
> And how long does it take to do a full table scan? ;-)
> (there's more, but that's the first thing you should see...)
>
> Try creating a second table where you invert the URL and key pair such
> that for each URL, you have a set of your alpha table's keys?
>
> Then you have the following...
> FOREACH row IN TABLE alpha:
>   FETCH key-set FROM beta
>   WHERE beta.rowkey = alpha.url
>
> Note I use FETCH to signify that you should get a single row in response.
>
> Does this make sense?
> ( your second table is actually and index of the URL column in your first
> table)
>
> HTH
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Apr 19, 2012, at 5:43 AM, Narendra yadala <na...@gmail.com>
> wrote:
>
> > I have an issue with my HBase cluster. We have a 4 node HBase/Hadoop
> (4*32
> > GB RAM and 4*6 TB disk space) cluster. We are using Cloudera distribution
> > for maintaining our cluster. I have a single tweets table in which we
> store
> > the tweets, one tweet per row (it has millions of rows currently).
> >
> > Now I try to run a Java batch (not a map reduce) which does the
> following :
> >
> >   1. Open a scanner over the tweet table and read the tweets one after
> >   another. I set scanner caching to 128 rows as higher scanner caching is
> >   leading to ScannerTimeoutExceptions. I scan over the first 10k rows
> only.
> >   2. For each tweet, extract URLs (linkcolfamily:urlvalue) that are there
> >   in that tweet and open another scanner over the tweets table to see who
> >   else shared that link. This involves getting rows having that URL from
> the
> >   entire table (not first 10k rows).
> >   3. Do similar stuff as in step 2 for hashtags
> >   (hashtagcolfamily:hashtagvalue).
> >   4. Do steps 1-3 in parallel for approximately 7-8 threads. This number
> >   can be higher (thousands also) later.
> >
> >
> > When I run this batch I got the GC issue which is specified here
> >
> http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/
> > Then I tried to turn on the MSLAB feature and changed the GC settings by
> > specifying  -XX:+UseParNewGC  and  -XX:+UseConcMarkSweepGC JVM flags.
> > Even after doing this, I am running into all kinds of IOExceptions
> > and SocketTimeoutExceptions.
> >
> > This Java batch opens approximately 7*2 (14) scanners open at a point in
> > time and still I am running into all kinds of troubles. I am wondering
> > whether I can have thousands of parallel scanners with HBase when I need
> to
> > scale.
> >
> > It would be great to know whether I can open thousands/millions of
> scanners
> > in parallel with HBase efficiently.
> >
> > Thanks
> > Narendra
>

Re: HBase parallel scanner performance

Posted by Michel Segel <mi...@hotmail.com>.
So in your step 2 you have the following:
FOREACH row IN TABLE alpha:
     SELECT something
     FROM TABLE alpha 
     WHERE alpha.url = row.url

Right?
And you are wondering why you are getting timeouts?
...
...
And how long does it take to do a full table scan? ;-)
(there's more, but that's the first thing you should see...)

Try creating a second table where you invert the URL and key pair such that for each URL, you have a set of your alpha table's keys?

Then you have the following...
FOREACH row IN TABLE alpha:
   FETCH key-set FROM beta 
   WHERE beta.rowkey = alpha.url

Note I use FETCH to signify that you should get a single row in response.

Does this make sense?
( your second table is actually and index of the URL column in your first table)

HTH 

Sent from a remote device. Please excuse any typos...

Mike Segel

On Apr 19, 2012, at 5:43 AM, Narendra yadala <na...@gmail.com> wrote:

> I have an issue with my HBase cluster. We have a 4 node HBase/Hadoop (4*32
> GB RAM and 4*6 TB disk space) cluster. We are using Cloudera distribution
> for maintaining our cluster. I have a single tweets table in which we store
> the tweets, one tweet per row (it has millions of rows currently).
> 
> Now I try to run a Java batch (not a map reduce) which does the following :
> 
>   1. Open a scanner over the tweet table and read the tweets one after
>   another. I set scanner caching to 128 rows as higher scanner caching is
>   leading to ScannerTimeoutExceptions. I scan over the first 10k rows only.
>   2. For each tweet, extract URLs (linkcolfamily:urlvalue) that are there
>   in that tweet and open another scanner over the tweets table to see who
>   else shared that link. This involves getting rows having that URL from the
>   entire table (not first 10k rows).
>   3. Do similar stuff as in step 2 for hashtags
>   (hashtagcolfamily:hashtagvalue).
>   4. Do steps 1-3 in parallel for approximately 7-8 threads. This number
>   can be higher (thousands also) later.
> 
> 
> When I run this batch I got the GC issue which is specified here
> http://www.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/
> Then I tried to turn on the MSLAB feature and changed the GC settings by
> specifying  -XX:+UseParNewGC  and  -XX:+UseConcMarkSweepGC JVM flags.
> Even after doing this, I am running into all kinds of IOExceptions
> and SocketTimeoutExceptions.
> 
> This Java batch opens approximately 7*2 (14) scanners open at a point in
> time and still I am running into all kinds of troubles. I am wondering
> whether I can have thousands of parallel scanners with HBase when I need to
> scale.
> 
> It would be great to know whether I can open thousands/millions of scanners
> in parallel with HBase efficiently.
> 
> Thanks
> Narendra