Posted to common-user@hadoop.apache.org by Billy <sa...@pearsonwholesale.com> on 2007/12/20 21:37:08 UTC

hbase master heap space

I'm not sure about this, but why does the master server use up so much 
memory? I've been running a script that has been inserting data into a table 
for a little over 24 hours, and the master crashed with 
java.lang.OutOfMemoryError: Java heap space.

So my question is: why does the master use up so much memory? At most it 
should hold the -ROOT- and .META. tables in memory, plus the block-to-table 
mapping.

Is it cache or a memory leak?

I am using the REST interface, so could that be the reason?

According to the high edit ids on all the region servers, I inserted about 
51,932,760 edits, and the master ran out of memory with a heap of about 1 GB.

The other side to this is that the data I inserted only takes up 886.61 MB, 
and that's with dfs.replication set to 2, so half of that is only about 
440 MB of data, compressed at the block level.

From what I understand, the master should have lower memory and CPU usage, 
and the namenode on Hadoop should be the memory hog, since it has to keep up 
with all the data about the blocks.
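A quick back-of-the-envelope check of the figures above (my own arithmetic sketch, not from the original message):

```python
# Sanity check of the numbers quoted above.
total_on_dfs_mb = 886.61   # reported space used in HDFS
replication = 2            # dfs.replication
edits = 51932760           # sum of the high edit ids on the region servers

logical_mb = total_on_dfs_mb / replication         # one replica's worth
bytes_per_edit = logical_mb * 1024 * 1024 / edits  # avg compressed edit size

print(round(logical_mb, 1))   # about 443 MB -- the "only 440MB" above
print(bytes_per_edit)         # roughly 9 bytes per edit after compression
```

Roughly 9 bytes per edit after block-level compression, which supports the point that the inserted data itself is small relative to the ~1 GB heap the master consumed.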




Re: hbase master heap space

Posted by Billy <sa...@pearsonwholesale.com>.
Any ideas on what might be causing the memory usage?

Billy





Re: hbase master heap space

Posted by Billy <sa...@pearsonwholesale.com>.
I tested it without scanners, and just inserting data causes the memory 
usage to rise and never recover, from what I've seen.

I submitted a job that downloads web pages, strips out the data needed, and 
inserts it into the table via PHP to REST. I used a text file for the input, 
so there were no reads from the table. Table splits were never more than 15, 
so cached meta, if any, should not be the problem.

A copy of the PHP insert function I use is below.

Basically, I open a socket connection to the REST interface and send this:

fputs( $fp, "PUT /api/".$table."/row/".$row."/ HTTP/1.1\r\n");
fputs( $fp, "Host: ".$master_ip."\r\n");
fputs( $fp, "Content-type: text/xml\r\n" );
fputs( $fp, "Content-length: ".strlen($xml)."\r\n");
fputs( $fp, "Connection: close\r\n\r\n");
fputs( $fp, $xml."\r\n\r\n");

Then I read the returned data from the socket and close the connection.

The REST interface starts out with about 38 MB of memory used and then 
climbs. I have not let it crash with a copy running outside of the master, 
but it did run out of memory while using the master's REST interface. It 
takes a lot of transactions to use up 1 GB of memory: I checked each table 
server, and the sum of the edit ids on a new install was about 51 million, 
using just the master as the interface.

This forces me to kill and restart the process from time to time to recover 
memory and keep it from crashing. Overall, speed remains the same in 
transactions/sec, and I do not see any other problems that I can tell.

{PHP Code Start}
<?php
// Insert one row into an HBase table via the REST interface.
// $col and $data may be scalars or parallel arrays of names/values.
function hbase_insert($master_ip,$table,$row,$col,$data){
 // normalize $col and $data to arrays
 if (!is_array($col)) {
  $column = array($col);
 } else {
  $column = $col;
 } // end if
 unset($col);
 if (!is_array($data)){
  $adata = array($data);
 } else {
  $adata = $data;
 } // end if
 unset($data);
 // loop over the column array, building the XML to submit
 $xml = '<?xml version="1.0" encoding="UTF-8"?><row>';
 for ($count=count($column), $zz=0; $zz<$count; $zz++){
  // make sure the column has a : on the end if it is not a child
  if (!ereg(":",$column[$zz])){
   $column[$zz] = $column[$zz].":";
  } // end if
  // append each column to the XML field
  $xml .= '<column><name>'.$column[$zz].'</name><value>'.base64_encode($adata[$zz]).'</value></column>';
 } // end for
 $xml .= '</row>';
 $fp = hbase_connect($master_ip);
 if (!$fp){
  return "failed";
 } // end if
 fputs( $fp, "PUT /api/".$table."/row/".$row."/ HTTP/1.1\r\n");
 fputs( $fp, "Host: ".$master_ip."\r\n");
 fputs( $fp, "Content-type: text/xml\r\n" );
 fputs( $fp, "Content-length: ".strlen($xml)."\r\n");
 fputs( $fp, "Connection: close\r\n\r\n");
 fputs( $fp, $xml."\r\n\r\n");

 // read the full response from the server
 $buff = "";
 while(!feof($fp)){
  $buff .= fgets($fp, 1024);
 } // end while
 fclose($fp);
 if (!ereg("HTTP/1.1 200 OK",$buff)){
  return $buff;
 } else {
  return "success";
 } // end if
} // end function

// Try a local REST server first, then fall back to the master.
// Returns false on failure, like fsockopen itself, so the caller's
// !$fp check works (returning -1 would be truthy and never caught).
function hbase_connect($master_ip){
 $fp = fsockopen("127.0.0.1", 60050, $errno, $errstr, 10);
 if ($fp){
  echo "Localhost\n";
  return $fp;
 } // end if
 $fp = fsockopen($master_ip, 60010, $errno, $errstr, 10);
 if ($fp){
  echo "Master\n";
  return $fp;
 } // end if
 return false;
} // end function
?>
{PHP Code End}
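For illustration, the request the function above assembles can be sketched as a pure string builder (a hypothetical Python transliteration of the PHP; the /api/<table>/row/<row>/ endpoint and the base64-encoded <column> XML come from the code itself):

```python
import base64

def build_put_request(master_ip, table, row, columns):
    """Build the raw HTTP PUT the PHP function above sends.

    `columns` maps column names (family:qualifier) to string values; a
    name with no ':' gets one appended, as in the PHP code.
    """
    parts = ['<?xml version="1.0" encoding="UTF-8"?><row>']
    for name, value in columns.items():
        if ':' not in name:
            name += ':'
        b64 = base64.b64encode(value.encode()).decode()
        parts.append('<column><name>%s</name><value>%s</value></column>'
                     % (name, b64))
    parts.append('</row>')
    xml = ''.join(parts)
    return ('PUT /api/%s/row/%s/ HTTP/1.1\r\n' % (table, row)
            + 'Host: %s\r\n' % master_ip
            + 'Content-type: text/xml\r\n'
            + 'Content-length: %d\r\n' % len(xml)
            + 'Connection: close\r\n\r\n'
            + xml + '\r\n\r\n')

req = build_put_request('10.0.0.1', 'webtable', 'row1', {'page': 'hello'})
```

Separating request construction from the socket like this also makes the client testable without a running REST server.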

"Bryan Duxbury" <br...@rapleaf.com> wrote in 
message news:0459216F-F3F0-46C1-B7DB-57A6479BD809@rapleaf.com...
> Are you closing the scanners when you're done? If not, those might be 
> hanging around for a long time. I don't think we've built in the proper 
> timeout logic to make that work by itself.
>
> -Bryan
>
> On Dec 21, 2007, at 5:10 PM, Billy wrote:
>
>> I was thinking the same thing, and I've been running REST outside of the 
>> Master on each server for about 5 hours now, using the master as a backup 
>> if the local REST interface failed. You are right: I've seen a little 
>> faster processing time from doing this vs. using just the master.
>>
>> Seems the problem is not with the master itself; it looks like REST is 
>> using up more and more memory. I'm not sure, but I think it has to do 
>> with inserts. Maybe not, but the memory usage is going up. I'm running a 
>> scanner with 2 threads reading rows, processing the data, and inserting 
>> it into a separate table, building an inverted index.
>>
>> I will restart everything when this job is done and try doing just 
>> inserts, to see whether it's the scanner or the inserts.
>>
>> The master is holding at about 75 MB, and the REST interfaces are up to 
>> 400 MB and slowly rising on the ones running the jobs.
>>
>> I am still testing; I will see what else I can come up with.
>>
>> Billy
>>
>>
>> "stack" <st...@duboce.net> wrote in message
>> news:476C1AA8.3030306@duboce.net...
>>> Hey Billy:
>>>
>>> The Master itself should use little memory, and though it is not out of 
>>> the realm of possibilities, it should not have a leak.
>>>
>>> Are you running with the default heap size?  You might want to give it 
>>> more memory if you are (see
>>> http://wiki.apache.org/lucene-hadoop/Hbase/FAQ#3 for how).
>>>
>>> If you are uploading everything via the REST server running on the 
>>> master, the problem, as you speculate, could be in the REST servlet 
>>> itself (though it looks like it shouldn't be holding on to anything, 
>>> having given it a cursory glance).  You could try running the REST 
>>> server independent of the master.  Grep for 'Starting the REST Server' 
>>> in this page, http://wiki.apache.org/lucene-hadoop/Hbase/HbaseRest, for 
>>> how. (If you are only running one REST instance, your upload might go 
>>> faster if you run multiple.)
>>>
>>> St.Ack
>>>
>>>
>>> Billy wrote:
>>>> I forgot to say that once restarted, the master only uses about 70 MB 
>>>> of memory.
>>>>
>>>> Billy




RE: hbase master heap space

Posted by Jim Kellerman <ji...@powerset.com>.
Scanners time out on the region server side and their resources get cleaned
up, but that does not happen on the client side unless you later call
the scanner again and the region server tells the client that the
scanner has timed out. In short, any application that uses a scanner
should close it. It might be a good idea to add a scanner watcher
on the client that shuts them down.

---
Jim Kellerman, Senior Engineer; Powerset
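The advice above can be sketched as a client-side guard that always sends the scanner DELETE, even when iteration fails partway. (A hypothetical Python sketch: the PUT-with-?column= and DELETE endpoints follow the usage shown elsewhere in this thread, while the next_row call and the injected transport are my assumptions, chosen so the close-on-exit pattern is testable without a server.)

```python
class RestScanner:
    """Context manager that guarantees a REST scanner gets closed.

    `transport` is any callable (method, path) -> (status, body); in real
    use it would perform the HTTP request against the REST server.
    """

    def __init__(self, transport, table, column):
        self.transport = transport
        self.table = table
        self.column = column
        self.scanner_id = None

    def __enter__(self):
        # Create the scanner; the server answers with its location,
        # ending in the scanner id.
        status, body = self.transport(
            'PUT', '/api/%s/scanner?column=%s' % (self.table, self.column))
        self.scanner_id = body.rsplit('/', 1)[-1]
        return self

    def next_row(self):
        # Fetch the next row from the open scanner (assumed endpoint).
        return self.transport(
            'POST', '/api/%s/scanner/%s' % (self.table, self.scanner_id))

    def __exit__(self, *exc):
        # The crucial part: DELETE the scanner even if iteration raised,
        # so the region server lease is not left to time out on its own.
        self.transport(
            'DELETE', '/api/%s/scanner/%s' % (self.table, self.scanner_id))
        return False

# Stub transport that records calls instead of talking to a server.
calls = []
def stub(method, path):
    calls.append((method, path))
    return (200, '/api/t/scanner/abc123')

with RestScanner(stub, 't', 'results:') as s:
    s.next_row()
```

With the `with` block, the DELETE is issued on both the normal path and the exception path, which is exactly the behavior a long-running job needs to avoid leaking scanner leases.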


> -----Original Message-----
> From: Bryan Duxbury [mailto:bryan@rapleaf.com]
> Sent: Friday, December 21, 2007 5:51 PM
> To: hadoop-user@lucene.apache.org
> Subject: Re: hbase master heap space
>
> Are you closing the scanners when you're done? If not, those
> might be hanging around for a long time. I don't think we've
> built in the proper timeout logic to make that work by itself.
>
> -Bryan
>

Re: hbase master heap space

Posted by Billy <sa...@pearsonwholesale.com>.
I applied the patch to my trunk version, 605657, but here is what I am 
getting.

Called to create a scanner for the results: column on my search_index table:
PUT /api/search_index/scanner?column=results:

Returned location:
/api/search_index/scanner/3977a5e4

Called:
DELETE /api/search_index/scanner/3977a5e4

Returned:
HTTP/1.1 500 3
Date: Sun, 30 Dec 2007 04:36:00 GMT
Server: Jetty/5.1.4 (Linux/2.6.9-67.0.1.ELsmp i386 java/1.5.0_12
Connection: close
Content-Type: text/html
Content-Length: 1230

<html>
<head>
<title>Error 500 3</title>
</head>
<body>
<h2>HTTP ERROR: 500</h2><pre>3</pre>
<p>RequestURI=/api/search_index/scanner/3977a5e4</p>
<p><i><small><a href="http://jetty.mortbay.org">Powered by 
Jetty://</a></small></i></p>


Billy


"Bryan Duxbury" <br...@rapleaf.com> wrote in 
message news:BE0CC670-92BE-41E4-9A94-EAAB72A09F02@rapleaf.com...
> I've created an issue and submitted a patch to fix the problem.
> Billy, can you download the patch and check to see if it works alright?
>
> https://issues.apache.org/jira/browse/HADOOP-2504
>
> -Bryan
>
> On Dec 29, 2007, at 3:36 PM, Billy wrote:
>
>> I checked and added the delete option to my code for the scanner, based
>> on the API from the wiki, but it looks like it's not working at this
>> time, based on the code and the response I got from the REST interface.
>> I get a "Not hooked back up yet" response. Any idea on when this will
>> be fixed?
>>
>> Thanks
>>
>> src/contrib/hbase/src/java/org/apache/hadoop/hbase/rest/ScannerHandler.java
>>
>> public void doDelete(HttpServletRequest request,
>>     HttpServletResponse response, String[] pathSegments)
>>     throws ServletException, IOException {
>>   doMethodNotAllowed(response, "Not hooked back up yet!");
>> }
>>




Re: hbase master heap space

Posted by Billy <sa...@pearsonwholesale.com>.
Also, I noticed that the return code should be 200 according to the code, 
but the wiki shows it as:

http://wiki.apache.org/lucene-hadoop/Hbase/HbaseRest
HTTP 202 (Accepted) if it can be closed. HTTP 404 (Not Found) if the scanner 
id is invalid. HTTP 410 (Gone) if the scanner is already closed or the lease 
time has expired.

One or the other should be updated so they match: 202 or 200.
Billy
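Until the code and the wiki agree, a client can treat any 2xx status as a successful scanner close (a hypothetical helper, not from the thread):

```python
def scanner_close_ok(response_head):
    """Return True if the status line of a raw HTTP response reports
    success for a scanner DELETE, accepting 200 or 202 (and any other
    2xx) while the code and the documentation disagree."""
    status_line = response_head.split('\r\n', 1)[0]
    parts = status_line.split(' ')
    if len(parts) < 2 or not parts[1].isdigit():
        return False
    return 200 <= int(parts[1]) < 300

print(scanner_close_ok('HTTP/1.1 200 OK\r\n'))        # True
print(scanner_close_ok('HTTP/1.1 202 Accepted\r\n'))  # True
print(scanner_close_ok('HTTP/1.1 500 3\r\n'))         # False
```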






Re: hbase master heap space

Posted by Billy <sa...@pearsonwholesale.com>.
No problem. I had the test script already written, so I could test it in seconds.

Looks good here.

Billy



"Bryan Duxbury" <br...@rapleaf.com> wrote in 
message news:475A4291-05EE-469F-B3BD-55759DAE2A8E@rapleaf.com...
>I posted another version of the patch that fixes this problem, I  think. 
>Give it another try?
>
> (Sorry for relying on you to do the testing - I figure you already  have 
> the framework set up, and I'm currently trapped in an airport.)
>
> -Bryan
>
> On Dec 29, 2007, at 8:44 PM, Billy wrote:
>
>> found this in the rest log i running rest outside of master and  logging 
>> it
>>
>> 07/12/29 22:36:00 WARN rest: /api/search_index/scanner/3977a5e4:
>> java.lang.ArrayIndexOutOfBoundsException: 3
>>         at
>> org.apache.hadoop.hbase.rest.ScannerHandler.doDelete 
>> (ScannerHandler.java:132)
>>         at
>> org.apache.hadoop.hbase.rest.Dispatcher.doDelete(Dispatcher.java:146)
>>         at javax.servlet.http.HttpServlet.service(HttpServlet.java: 715)
>>         at javax.servlet.http.HttpServlet.service(HttpServlet.java: 802)
>>         at
>> org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
>>         at
>> org.mortbay.jetty.servlet.WebApplicationHandler.dispatch 
>> (WebApplicationHandler.java:475)
>>         at
>> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java: 567)
>>         at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
>>         at
>> org.mortbay.jetty.servlet.WebApplicationContext.handle 
>> (WebApplicationContext.java:635)
>>         at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
>>         at org.mortbay.http.HttpServer.service(HttpServer.java:954)
>>         at org.mortbay.http.HttpConnection.service 
>> (HttpConnection.java:814)
>>         at
>> org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
>>         at org.mortbay.http.HttpConnection.handle 
>> (HttpConnection.java:831)
>>         at
>> org.mortbay.http.SocketListener.handleConnection 
>> (SocketListener.java:244)
>>         at org.mortbay.util.ThreadedServer.handle 
>> (ThreadedServer.java:357)
>>         at org.mortbay.util.ThreadPool$PoolThread.run 
>> (ThreadPool.java:534)
>>
>>
>> "Bryan Duxbury" <br...@rapleaf.com> wrote in
>> message 
>> news:BE0CC670-92BE-41E4-9A94-EAAB72A09F02@rapleaf.com...
>>> I've created an issue and submitted a patch to fix the problem.
>>> Billy, can you download the patch and check to see if it works  alright?
>>>
>>> https://issues.apache.org/jira/browse/HADOOP-2504
>>>
>>> -Bryan
>>>
>>> On Dec 29, 2007, at 3:36 PM, Billy wrote:
>>>
>>>> I checked and added the delete option to my code for the scanner
>>>> based on
>>>> the api from wiki but it looks like its not working at this time
>>>> basedo nthe
>>>> code and responce I got form the rest interfase. i get a "Not
>>>> hooked back up
>>>> yet" responce any idea on when this will be fixed?
>>>>
>>>> Thanks
>>>>
>>>> src/contrib/hbase/src/java/org/apache/hadoop/hbase/rest/
>>>> ScannerHandler.java
>>>>
>>>> public void doDelete(HttpServletRequest request, HttpServletResponse
>>>> response,
>>>> String[] pathSegments)
>>>> throws ServletException, IOException {
>>>> doMethodNotAllowed(response, "Not hooked back up yet!");
>>>> }
>>>>
>>>>
>>>> "Bryan Duxbury" <br...@rapleaf.com> wrote 
>>>> in
>>>> message
>>>> news:0459216F-F3F0-46C1-B7DB-57A6479BD809@rapleaf.com...
>>>>> Are you closing the scanners when you're done? If not, those  might be
>>>>> hanging around for a long time. I don't think we've built in the
>>>>> proper
>>>>> timeout logic to make that work by itself.
>>>>>
>>>>> -Bryan
>>>>>
>>>>> On Dec 21, 2007, at 5:10 PM, Billy wrote:
>>>>>
>>>>>> I was thanking the same thing and been running REST outside of the
>>>>>> Master on
>>>>>> each server for about 5 hours now and used the master as a
>>>>>> backup  if
>>>>>> local
>>>>>> rest interface failed. You are right I seen a little faster
>>>>>> processing
>>>>>> time
>>>>>> from doing this vs. using just the master.
>>>>>>
>>>>>> Seams the problem is not with the master its self looks like
>>>>>> REST  is
>>>>>> using
>>>>>> up more and more memory not sure but I thank its to do with  inserts
>>>>>> maybe
>>>>>> not but the memory usage is going up I an doing a scanner 2  threads
>>>>>> reading
>>>>>> rows and processing the data and inserting it in to a separate  table
>>>>>> building a inverted index.
>>>>>>
>>>>>> I will restart everything when this job is done and try to do just
>>>>>> inserts
>>>>>> and see if its the scanner or inserts.
>>>>>>
>>>>>> The master is holding at about 75mb and the rest interfaces are
>>>>>> up  to
>>>>>> 400MB
>>>>>> and slowly going up on the ones running the jobs.
>>>>>>
>>>>>> I am still testing I will see what else I can come up with.
>>>>>>
>>>>>> Billy
> 




Re: hbase master heap space

Posted by Bryan Duxbury <br...@rapleaf.com>.
I posted another version of the patch that fixes this problem, I  
think. Give it another try?

(Sorry for relying on you to do the testing - I figure you already  
have the framework set up, and I'm currently trapped in an airport.)

-Bryan

On Dec 29, 2007, at 8:44 PM, Billy wrote:

> found this in the rest log i running rest outside of master and  
> logging it
>
> 07/12/29 22:36:00 WARN rest: /api/search_index/scanner/3977a5e4:
> java.lang.ArrayIndexOutOfBoundsException: 3
>         at
> org.apache.hadoop.hbase.rest.ScannerHandler.doDelete 
> (ScannerHandler.java:132)
>         at
> org.apache.hadoop.hbase.rest.Dispatcher.doDelete(Dispatcher.java:146)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java: 
> 715)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java: 
> 802)
>         at
> org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
>         at
> org.mortbay.jetty.servlet.WebApplicationHandler.dispatch 
> (WebApplicationHandler.java:475)
>         at
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java: 
> 567)
>         at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
>         at
> org.mortbay.jetty.servlet.WebApplicationContext.handle 
> (WebApplicationContext.java:635)
>         at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
>         at org.mortbay.http.HttpServer.service(HttpServer.java:954)
>         at org.mortbay.http.HttpConnection.service 
> (HttpConnection.java:814)
>         at
> org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
>         at org.mortbay.http.HttpConnection.handle 
> (HttpConnection.java:831)
>         at
> org.mortbay.http.SocketListener.handleConnection 
> (SocketListener.java:244)
>         at org.mortbay.util.ThreadedServer.handle 
> (ThreadedServer.java:357)
>         at org.mortbay.util.ThreadPool$PoolThread.run 
> (ThreadPool.java:534)
>
>


Re: hbase master heap space

Posted by Billy <sa...@pearsonwholesale.com>.
I found this in the REST log. I'm running REST outside of the master and logging its output:

07/12/29 22:36:00 WARN rest: /api/search_index/scanner/3977a5e4:
java.lang.ArrayIndexOutOfBoundsException: 3
        at 
org.apache.hadoop.hbase.rest.ScannerHandler.doDelete(ScannerHandler.java:132)
        at 
org.apache.hadoop.hbase.rest.Dispatcher.doDelete(Dispatcher.java:146)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:715)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
        at 
org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
        at 
org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
        at 
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
        at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
        at 
org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
        at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
        at org.mortbay.http.HttpServer.service(HttpServer.java:954)
        at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
        at 
org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
        at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
        at 
org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
        at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
        at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)
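An ArrayIndexOutOfBoundsException on index 3 suggests the handler expected more path segments than the request actually produced. As a rough illustration of that failure mode (Python, purely a sketch, not the actual servlet code):

```python
# Hypothetical illustration of how splitting a request path can yield
# fewer segments than a handler expects. This is NOT the HBase code,
# just a sketch of the failure mode behind an index-out-of-bounds.

def segments(path):
    # Drop empty pieces produced by leading/trailing slashes.
    return [p for p in path.split("/") if p]

full = segments("/api/search_index/scanner/3977a5e4")
short = segments("/api/search_index/scanner")

print(full)   # ['api', 'search_index', 'scanner', '3977a5e4'] - index 3 is the id
print(short)  # only 3 segments: indexing [3] here would throw
```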


"Bryan Duxbury" <br...@rapleaf.com> wrote in 
message news:BE0CC670-92BE-41E4-9A94-EAAB72A09F02@rapleaf.com...
> I've created an issue and submitted a patch to fix the problem.
> Billy, can you download the patch and check to see if it works alright?
>
> https://issues.apache.org/jira/browse/HADOOP-2504
>
> -Bryan
>




Re: hbase master heap space

Posted by Bryan Duxbury <br...@rapleaf.com>.
I've created an issue and submitted a patch to fix the problem.  
Billy, can you download the patch and check to see if it works alright?

https://issues.apache.org/jira/browse/HADOOP-2504

-Bryan

On Dec 29, 2007, at 3:36 PM, Billy wrote:

> I checked and added the delete option to my code for the scanner  
> based on
> the api from wiki but it looks like its not working at this time  
> basedo nthe
> code and responce I got form the rest interfase. i get a "Not  
> hooked back up
> yet" responce any idea on when this will be fixed?
>
> Thanks
>
> src/contrib/hbase/src/java/org/apache/hadoop/hbase/rest/ 
> ScannerHandler.java
>
> public void doDelete(HttpServletRequest request, HttpServletResponse
> response,
> String[] pathSegments)
> throws ServletException, IOException {
> doMethodNotAllowed(response, "Not hooked back up yet!");
> }
>
>


Re: hbase master heap space

Posted by Billy <sa...@pearsonwholesale.com>.
I checked and added the delete option to my scanner code, based on the API from the wiki, but it looks like it's not working at this time, judging by the code and the response I got from the REST interface. I get a "Not hooked back up yet" response. Any idea on when this will be fixed?

Thanks

src/contrib/hbase/src/java/org/apache/hadoop/hbase/rest/ScannerHandler.java

public void doDelete(HttpServletRequest request, HttpServletResponse response,
    String[] pathSegments)
    throws ServletException, IOException {
  doMethodNotAllowed(response, "Not hooked back up yet!");
}
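For reference, once doDelete is hooked back up, closing a scanner from a client is just an HTTP DELETE on the scanner resource (the same URL shape that appears in the logs). A minimal sketch of the request a client would send; the host and port here are assumptions, not values from this thread:

```python
# Sketch of the raw HTTP request a client would send to close a scanner
# through the HBase REST interface. The URL shape matches the one seen
# in the logs (/api/<table>/scanner/<id>); host and port are assumptions.

def close_scanner_request(table, scanner_id, host="localhost", port=60050):
    # DELETE on the scanner resource maps to ScannerHandler.doDelete.
    return (
        f"DELETE /api/{table}/scanner/{scanner_id} HTTP/1.0\r\n"
        f"Host: {host}:{port}\r\n"
        "\r\n"
    )

print(close_scanner_request("search_index", "3977a5e4"))
```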


"Bryan Duxbury" <br...@rapleaf.com> wrote in 
message news:0459216F-F3F0-46C1-B7DB-57A6479BD809@rapleaf.com...
> Are you closing the scanners when you're done? If not, those might be 
> hanging around for a long time. I don't think we've built in the  proper 
> timeout logic to make that work by itself.
>
> -Bryan
>




Re: hbase master heap space

Posted by Bryan Duxbury <br...@rapleaf.com>.
Are you closing the scanners when you're done? If not, those might be  
hanging around for a long time. I don't think we've built in the  
proper timeout logic to make that work by itself.

-Bryan
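[Editor's note: Bryan's suggestion generalizes to any client holding a server-side cursor: the server cannot reclaim the cursor's resources until the client explicitly closes it. A minimal sketch of the discipline, using a hypothetical stand-in class, not the real HBase client API:]

```java
// Illustrative stand-in for a server-side scanner handle; NOT the real
// HBase API. The point is the try/finally discipline: the server can only
// reclaim the cursor once close() has been called, so close unconditionally.
public class ScannerClose {
    static class Scanner implements AutoCloseable {
        private boolean open = true;
        // Returns a row while open, null once closed (stand-in for real reads).
        String next() { return open ? "row" : null; }
        @Override public void close() { open = false; }
        boolean isOpen() { return open; }
    }

    public static void main(String[] args) {
        Scanner scanner = new Scanner();
        try {
            String row;
            int rows = 0;
            while ((row = scanner.next()) != null && rows < 3) {
                rows++; // process the row here
            }
        } finally {
            scanner.close(); // always release the server-side cursor
        }
        System.out.println(scanner.isOpen()); // prints false
    }
}
```

The finally block guarantees the close runs even if row processing throws, which is what keeps abandoned scanners from piling up on the server.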

On Dec 21, 2007, at 5:10 PM, Billy wrote:

> I was thanking the same thing and been running REST outside of the  
> Master on
> each server for about 5 hours now and used the master as a backup  
> if local
> rest interface failed. You are right I seen a little faster  
> processing time
> from doing this vs. using just the master.
>
> Seams the problem is not with the master its self looks like REST  
> is using
> up more and more memory not sure but I thank its to do with inserts  
> maybe
> not but the memory usage is going up I an doing a scanner 2 threads  
> reading
> rows and processing the data and inserting it in to a separate table
> building a inverted index.
>
> I will restart everything when this job is done and try to do just  
> inserts
> and see if its the scanner or inserts.
>
> The master is holding at about 75mb and the rest interfaces are up  
> to 400MB
> and slowly going up on the ones running the jobs.
>
> I am still testing I will see what else I can come up with.
>
> Billy
>
>
> "stack" <st...@duboce.net> wrote in message
> news:476C1AA8.3030306@duboce.net...
>> Hey Billy:
>>
>> Master itself should use little memory and though it is not out of  
>> the
>> realm of possibiliites, it should not have a leak.
>>
>> Are you running with the default heap size?  You might want to  
>> give it
>> more memory if you are (See
>> http://wiki.apache.org/lucene-hadoop/Hbase/FAQ#3 for how).
>>
>> If you are uploading all via the REST server running on the  
>> master, the
>> problem as you speculate, could be in the REST servlet itself  
>> (though it
>> looks like it shouldn't be holding on to anything having given it a
>> cursory glance).  You could try running the REST server  
>> independent of the
>> master.  Grep for 'Starting the REST Server' in this page,
>> http://wiki.apache.org/lucene-hadoop/Hbase/HbaseRest, for how (If  
>> you are
>> only running one REST instance, your upload might go faster if you  
>> run
>> multiple).
>>
>> St.Ack
>>
>>
>> Billy wrote:
>>> I forgot to say that once restart the master only uses about 70mb of
>>> memory
>>>
>>> Billy
>>>
>>> "Billy" <sa...@pearsonwholesale.com> wrote
>>> in message news:fkejpo$u8c$1@ger.gmane.org...
>>>
>>>> I not sure of this but why does the master server use up so much  
>>>> memory.
>>>> I been running an script that been inserting data into a table  
>>>> for a
>>>> little over 24 hours and the master crashed because of
>>>> java.lang.OutOfMemoryError: Java heap space.
>>>>
>>>> So my question is why does the master use up so much memory at  
>>>> most it
>>>> should store the -ROOT-,.META. tables in memory and block to table
>>>> mapping.
>>>>
>>>> Is it cache or a memory leak?
>>>>
>>>> I am using the rest interface so could that be the reason?
>>>>
>>>> I inserted according to the high edit ids on all the region servers
>>>> about
>>>> 51,932,760 edits and the master ran out of memory with a heap of  
>>>> about
>>>> 1GB.
>>>>
>>>> The other side to this is the data I inserted is only taking up  
>>>> 886.61
>>>> MB and that's with
>>>> dfs.replication set to 2 so half that is only 440MB of data  
>>>> compressed
>>>> at the block level.
>>>> From what I understand the master should have lower memory and  
>>>> cpu usage
>>>> and the namenode on hadoop should be the memory hog it has to  
>>>> keep up
>>>> with all the data about the blocks.
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>>
>
>
>


Re: hbase master heap space

Posted by Billy <sa...@pearsonwholesale.com>.
I was thinking the same thing, and have been running REST outside of the Master on 
each server for about 5 hours now, using the master as a backup if the local 
REST interface fails. You are right, I've seen a little faster processing time 
from doing this vs. using just the master.

Seems the problem is not with the master itself; it looks like REST is using 
up more and more memory. I'm not sure, but I think it's to do with the inserts. 
Either way, the memory usage keeps going up. I am running a scanner with 2 
threads reading rows, processing the data, and inserting it into a separate 
table, building an inverted index.

I will restart everything when this job is done and try doing just inserts, 
to see whether it's the scanner or the inserts.

The master is holding at about 75MB, and the REST interfaces are up to 400MB 
and slowly climbing on the ones running the jobs.

I am still testing; I will see what else I can come up with.

Billy


"stack" <st...@duboce.net> wrote in message 
news:476C1AA8.3030306@duboce.net...
> Hey Billy:
>
> Master itself should use little memory and though it is not out of the 
> realm of possibiliites, it should not have a leak.
>
> Are you running with the default heap size?  You might want to give it 
> more memory if you are (See 
> http://wiki.apache.org/lucene-hadoop/Hbase/FAQ#3 for how).
>
> If you are uploading all via the REST server running on the master, the 
> problem as you speculate, could be in the REST servlet itself (though it 
> looks like it shouldn't be holding on to anything having given it a 
> cursory glance).  You could try running the REST server independent of the 
> master.  Grep for 'Starting the REST Server' in this page, 
> http://wiki.apache.org/lucene-hadoop/Hbase/HbaseRest, for how (If you are 
> only running one REST instance, your upload might go faster if you run 
> multiple).
>
> St.Ack
>
>
> Billy wrote:
>> I forgot to say that once restart the master only uses about 70mb of 
>> memory
>>
>> Billy
>>
>> "Billy" <sa...@pearsonwholesale.com> wrote 
>> in message news:fkejpo$u8c$1@ger.gmane.org...
>>
>>> I not sure of this but why does the master server use up so much memory. 
>>> I been running an script that been inserting data into a table for a 
>>> little over 24 hours and the master crashed because of 
>>> java.lang.OutOfMemoryError: Java heap space.
>>>
>>> So my question is why does the master use up so much memory at most it 
>>> should store the -ROOT-,.META. tables in memory and block to table 
>>> mapping.
>>>
>>> Is it cache or a memory leak?
>>>
>>> I am using the rest interface so could that be the reason?
>>>
>>> I inserted according to the high edit ids on all the region servers 
>>> about
>>> 51,932,760 edits and the master ran out of memory with a heap of about 
>>> 1GB.
>>>
>>> The other side to this is the data I inserted is only taking up 886.61 
>>> MB and that's with
>>> dfs.replication set to 2 so half that is only 440MB of data compressed 
>>> at the block level.
>>> From what I understand the master should have lower memory and cpu usage 
>>> and the namenode on hadoop should be the memory hog it has to keep up 
>>> with all the data about the blocks.
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
> 




Re: hbase master heap space

Posted by stack <st...@duboce.net>.
Hey Billy:

Master itself should use little memory and, though it is not out of the 
realm of possibilities, it should not have a leak.

Are you running with the default heap size?  You might want to give it 
more memory if you are (See 
http://wiki.apache.org/lucene-hadoop/Hbase/FAQ#3 for how).

If you are uploading all via the REST server running on the master, the 
problem as you speculate, could be in the REST servlet itself (though it 
looks like it shouldn't be holding on to anything having given it a 
cursory glance).  You could try running the REST server independent of 
the master.  Grep for 'Starting the REST Server' in this page, 
http://wiki.apache.org/lucene-hadoop/Hbase/HbaseRest, for how (If you 
are only running one REST instance, your upload might go faster if you 
run multiple).

St.Ack
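[Editor's note: the heap bump stack suggests is set in conf/hbase-env.sh. A sketch, assuming the HBASE_HEAPSIZE knob described in the linked FAQ; the value below is illustrative, not a recommendation:]

```shell
# conf/hbase-env.sh -- maximum heap for the HBase daemons, in MB.
# 1000 is an illustrative value; size it to your workload.
export HBASE_HEAPSIZE=1000
```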


Billy wrote:
> I forgot to say that once restart the master only uses about 70mb of memory
>
> Billy
>
> "Billy" <sa...@pearsonwholesale.com> wrote in 
> message news:fkejpo$u8c$1@ger.gmane.org...
>   
>> I not sure of this but why does the master server use up so much memory. I 
>> been running an script that been inserting data into a table for a little 
>> over 24 hours and the master crashed because of java.lang.OutOfMemoryError: 
>> Java heap space.
>>
>> So my question is why does the master use up so much memory at most it 
>> should store the -ROOT-,.META. tables in memory and block to table 
>> mapping.
>>
>> Is it cache or a memory leak?
>>
>> I am using the rest interface so could that be the reason?
>>
>> I inserted according to the high edit ids on all the region servers about
>> 51,932,760 edits and the master ran out of memory with a heap of about 
>> 1GB.
>>
>> The other side to this is the data I inserted is only taking up 886.61 MB 
>> and that's with
>> dfs.replication set to 2 so half that is only 440MB of data compressed at 
>> the block level.
>> From what I understand the master should have lower memory and cpu usage 
>> and the namenode on hadoop should be the memory hog it has to keep up with 
>> all the data about the blocks.
>>
>>
>>
>>     
>
>
>
>   


Re: hbase master heap space

Posted by Billy <sa...@pearsonwholesale.com>.
I forgot to say that once restarted, the master only uses about 70MB of memory.

Billy

"Billy" <sa...@pearsonwholesale.com> wrote in 
message news:fkejpo$u8c$1@ger.gmane.org...
>I not sure of this but why does the master server use up so much memory. I 
>been running an script that been inserting data into a table for a little 
>over 24 hours and the master crashed because of java.lang.OutOfMemoryError: 
>Java heap space.
>
> So my question is why does the master use up so much memory at most it 
> should store the -ROOT-,.META. tables in memory and block to table 
> mapping.
>
> Is it cache or a memory leak?
>
> I am using the rest interface so could that be the reason?
>
> I inserted according to the high edit ids on all the region servers about
> 51,932,760 edits and the master ran out of memory with a heap of about 
> 1GB.
>
> The other side to this is the data I inserted is only taking up 886.61 MB 
> and that's with
> dfs.replication set to 2 so half that is only 440MB of data compressed at 
> the block level.
> From what I understand the master should have lower memory and cpu usage 
> and the namenode on hadoop should be the memory hog it has to keep up with 
> all the data about the blocks.
>
>
>