Posted to common-user@hadoop.apache.org by Billy <sa...@pearsonwholesale.com> on 2007/12/20 21:37:08 UTC
hbase master heap space
I am not sure about this, but why does the master server use up so much
memory? I have been running a script that has been inserting data into a
table for a little over 24 hours, and the master crashed with
java.lang.OutOfMemoryError: Java heap space.
So my question is: why does the master use so much memory? At most it should
store the -ROOT- and .META. tables in memory, plus the block-to-table mapping.
Is it a cache or a memory leak?
I am using the REST interface, so could that be the reason?
According to the highest edit ids on all the region servers, I inserted about
51,932,760 edits, and the master ran out of memory with a heap of about 1 GB.
The other side to this is that the data I inserted takes up only 886.61 MB,
and that is with dfs.replication set to 2, so half of that is only about
440 MB of data, compressed at the block level.
From what I understand, the master should have low memory and CPU usage, and
the namenode on Hadoop should be the memory hog, since it has to keep up with
all the metadata about the blocks.
Re: hbase master heap space
Posted by Billy <sa...@pearsonwholesale.com>.
Any ideas on what might be causing the memory usage?
Billy
Re: hbase master heap space
Posted by Billy <sa...@pearsonwholesale.com>.
I tested it without scanners: just inserting data causes the memory usage to
rise, and it never recovers from what I have seen.
I submitted a job that downloads web pages, strips out the data needed, and
inserts it into the table via PHP to REST. I used a text file for the input,
so there are no reads from the table. Table splits never went above 15, so
cached .META. data, if any, should not be the problem.
A copy of the PHP insert function I use is below.
Basically, I open a socket connection to the REST interface and send this:
fputs( $fp, "PUT /api/".$table."/row/".$row."/ HTTP/1.1\r\n");
fputs( $fp, "Host: ".$master_ip."\r\n");
fputs( $fp, "Content-type: text/xml\r\n" );
fputs( $fp, "Content-length: ".strlen($xml)."\r\n");
fputs( $fp, "Connection: close\r\n\r\n");
fputs( $fp, $xml."\r\n\r\n");
Then I read the response from the socket and close the connection.
The REST interface starts out using about 38 MB of memory and then climbs. I
have not let it crash with a copy running outside of the master, but it did
run out of memory while using the master's REST interface. It takes a lot of
transactions to use up 1 GB of memory: I checked each region server, and the
sum of the edit ids on a fresh install was about 51 million, using just the
master as the interface.
This forces me to kill and restart the process from time to time to recover
memory and keep it from crashing.
Overall speed in transactions per second remains the same, and I do not see
any other problems that I can tell.
{PHP Code Start}
<?php
// Insert one row into an HBase table via the REST interface.
// $col and $data may be scalars or parallel arrays of column names and values.
function hbase_insert($master_ip, $table, $row, $col, $data) {
    // Normalize both arguments to arrays.
    if (!is_array($col)) {
        $column = array($col);
    } else {
        $column = $col;
    }
    unset($col);
    if (!is_array($data)) {
        $adata = array($data);
    } else {
        $adata = $data;
    }
    unset($data);
    // Loop over the column array, building the XML body to submit.
    $xml = '<?xml version="1.0" encoding="UTF-8"?><row>';
    for ($count = count($column), $zz = 0; $zz < $count; $zz++) {
        // Make sure the column name ends with ':' if it has no qualifier.
        if (!ereg(":", $column[$zz])) {
            $column[$zz] = $column[$zz] . ":";
        }
        // Append each column to the XML body.
        $xml .= '<column><name>' . $column[$zz] . '</name><value>'
              . base64_encode($adata[$zz]) . '</value></column>';
    }
    $xml .= '</row>';
    $fp = hbase_connect($master_ip);
    if (!$fp) {
        return "failed";
    }
    fputs($fp, "PUT /api/" . $table . "/row/" . $row . "/ HTTP/1.1\r\n");
    fputs($fp, "Host: " . $master_ip . "\r\n");
    fputs($fp, "Content-type: text/xml\r\n");
    fputs($fp, "Content-length: " . strlen($xml) . "\r\n");
    fputs($fp, "Connection: close\r\n\r\n");
    fputs($fp, $xml . "\r\n\r\n");
    // Read the whole response from the server.
    $buff = "";
    while (!feof($fp)) {
        $buff .= fgets($fp, 1024);
    }
    fclose($fp);
    if (!ereg("HTTP/1.1 200 OK", $buff)) {
        return $buff;
    }
    return "success";
}

// Connect to a local REST server first, falling back to the master.
function hbase_connect($master_ip) {
    $fp = fsockopen("127.0.0.1", "60050", $errno, $errstr, 10);
    if ($fp) {
        echo "Localhost\n";
        return $fp;
    }
    $fp = fsockopen($master_ip, "60010", $errno, $errstr, 10);
    if ($fp) {
        echo "Master\n";
        return $fp;
    }
    // Return false (not -1): -1 is truthy, so the caller's !$fp check
    // would never detect the failure.
    return false;
}
?>
{PHP Code End}
"Bryan Duxbury" <br...@rapleaf.com> wrote in
message news:0459216F-F3F0-46C1-B7DB-57A6479BD809@rapleaf.com...
> Are you closing the scanners when you're done? If not, those might be
> hanging around for a long time. I don't think we've built in the proper
> timeout logic to make that work by itself.
>
> -Bryan
>
> On Dec 21, 2007, at 5:10 PM, Billy wrote:
>
>> I was thinking the same thing, and I have been running REST outside of the
>> Master on each server for about 5 hours now, using the master as a backup
>> if the local REST interface failed. You are right; I saw slightly faster
>> processing time from doing this vs. using just the master.
>>
>> It seems the problem is not with the master itself; it looks like REST is
>> using up more and more memory. I am not sure, but I think it has to do
>> with inserts; maybe not, but memory usage is going up. I am running a
>> scanner with 2 threads reading rows, processing the data, and inserting it
>> into a separate table, building an inverted index.
>>
>> I will restart everything when this job is done and try doing just
>> inserts, to see if it is the scanner or the inserts.
>>
>> The master is holding at about 75 MB, and the REST interfaces are up to
>> 400 MB and slowly climbing on the ones running the jobs.
>>
>> I am still testing; I will see what else I can come up with.
>>
>> Billy
>>
>>
>> "stack" <st...@duboce.net> wrote in message
>> news:476C1AA8.3030306@duboce.net...
>>> Hey Billy:
>>>
>>> Master itself should use little memory, and though it is not out of the
>>> realm of possibilities, it should not have a leak.
>>>
>>> Are you running with the default heap size? You might want to give it
>>> more memory if you are (See
>>> http://wiki.apache.org/lucene-hadoop/Hbase/FAQ#3 for how).
>>>
>>> If you are uploading everything via the REST server running on the master,
>>> the problem, as you speculate, could be in the REST servlet itself (though
>>> it looks like it shouldn't be holding on to anything, having given it a
>>> cursory glance). You could try running the REST server independent of the
>>> master. Grep for 'Starting the REST Server' in this page,
>>> http://wiki.apache.org/lucene-hadoop/Hbase/HbaseRest, for how (If you
>>> are
>>> only running one REST instance, your upload might go faster if you run
>>> multiple).
>>>
>>> St.Ack
>>>
>>>
>>> Billy wrote:
>>>> I forgot to say that, once restarted, the master only uses about 70 MB
>>>> of memory.
>>>>
>>>> Billy
RE: hbase master heap space
Posted by Jim Kellerman <ji...@powerset.com>.
Scanners time out on the region server side and resources get cleaned
up, but that does not happen on the client side unless you later call
the scanner again and the region server tells the client that that
scanner has timed out. In short, any application that uses a scanner
should close it. It might be a good idea to add a scanner watcher
on the client that shuts them down.
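Jim's advice can be sketched concretely against the REST interface used
elsewhere in this thread. This is a hypothetical client-side helper, not code
from HBase: the function names are made up, the scanner paths follow the
HbaseRest wiki page, and the port is the one Billy's script uses.

```php
<?php
// Build the raw HTTP request that closes a REST scanner (hypothetical helper).
// Pure string building, so it can be tested without a live server.
function build_scanner_delete($host, $table, $scanner_id) {
    return "DELETE /api/" . $table . "/scanner/" . $scanner_id . " HTTP/1.1\r\n"
         . "Host: " . $host . "\r\n"
         . "Connection: close\r\n\r\n";
}

// Send the close request over a socket; return true on any 2xx response.
function hbase_scanner_close($master_ip, $table, $scanner_id) {
    $fp = fsockopen($master_ip, "60050", $errno, $errstr, 10);
    if (!$fp) {
        return false;
    }
    fputs($fp, build_scanner_delete($master_ip, $table, $scanner_id));
    // Read the whole response, then close the socket.
    $buff = "";
    while (!feof($fp)) {
        $buff .= fgets($fp, 1024);
    }
    fclose($fp);
    return preg_match('#^HTTP/1\.[01] 2#', $buff) == 1;
}
?>
```

Calling a helper like this as soon as a scan finishes, rather than letting
the server-side lease expire, is the client-side half of what is described
above.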
---
Jim Kellerman, Senior Engineer; Powerset
> -----Original Message-----
> From: Bryan Duxbury [mailto:bryan@rapleaf.com]
> Sent: Friday, December 21, 2007 5:51 PM
> To: hadoop-user@lucene.apache.org
> Subject: Re: hbase master heap space
>
> Are you closing the scanners when you're done? If not, those
> might be hanging around for a long time. I don't think we've
> built in the proper timeout logic to make that work by itself.
>
> -Bryan
Re: hbase master heap space
Posted by Billy <sa...@pearsonwholesale.com>.
I applied the patch to my trunk checkout, revision 605657, but here is what I
am getting.
I asked for the results: column from my search_index table:
PUT /api/search_index/scanner?column=results:
It returned the location:
/api/search_index/scanner/3977a5e4
Then I called:
DELETE /api/search_index/scanner/3977a5e4
and it returned:
HTTP/1.1 500 3
Date: Sun, 30 Dec 2007 04:36:00 GMT
Server: Jetty/5.1.4 (Linux/2.6.9-67.0.1.ELsmp i386 java/1.5.0_12
Connection: close
Content-Type: text/html
Content-Length: 1230
<html>
<head>
<title>Error 500 3</title>
</head>
<body>
<h2>HTTP ERROR: 500</h2><pre>3</pre>
<p>RequestURI=/api/search_index/scanner/3977a5e4</p>
<p><i><small><a href="http://jetty.mortbay.org">Powered by
Jetty://</a></small></i></p>
Billy
"Bryan Duxbury" <br...@rapleaf.com> wrote in
message news:BE0CC670-92BE-41E4-9A94-EAAB72A09F02@rapleaf.com...
> I've created an issue and submitted a patch to fix the problem.
> Billy, can you download the patch and check to see if it works alright?
>
> https://issues.apache.org/jira/browse/HADOOP-2504
>
> -Bryan
>
> On Dec 29, 2007, at 3:36 PM, Billy wrote:
>
>> I checked and added the delete option to my code for the scanner, based on
>> the API from the wiki, but it looks like it's not working at this time,
>> based on the code and the response I got from the REST interface. I get a
>> "Not hooked back up yet" response. Any idea on when this will be fixed?
>>
>> Thanks
>>
>> src/contrib/hbase/src/java/org/apache/hadoop/hbase/rest/ScannerHandler.java
>>
>> public void doDelete(HttpServletRequest request,
>>     HttpServletResponse response, String[] pathSegments)
>>     throws ServletException, IOException {
>>   doMethodNotAllowed(response, "Not hooked back up yet!");
>> }
Re: hbase master heap space
Posted by Billy <sa...@pearsonwholesale.com>.
I also noticed that the return code from the code is 200, but the wiki shows
it differently:
http://wiki.apache.org/lucene-hadoop/Hbase/HbaseRest
HTTP 202 (Accepted) if it can be closed. HTTP 404 (Not Found) if the scanner
id is invalid. HTTP 410 (Gone) if the scanner is already closed or the lease
time has expired.
One or the other should be updated so they match: 202 or 200.
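Until the code and the wiki agree, a client can simply accept either success
status when closing a scanner. A small hypothetical PHP helper (the function
names are made up) illustrating that:

```php
<?php
// Extract the integer HTTP status code from a raw response, or 0 if none.
function http_status($response) {
    if (preg_match('#^HTTP/1\.[01] (\d{3})#', $response, $m)) {
        return (int) $m[1];
    }
    return 0;
}

// Treat both 200 (what the REST code currently returns) and 202 (what the
// wiki documents) as a successful scanner close.
function scanner_close_ok($response) {
    $code = http_status($response);
    return $code == 200 || $code == 202;
}
?>
```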
Billy
Re: hbase master heap space
Posted by Billy <sa...@pearsonwholesale.com>.
No problem. I had the test script already written, so I could test it in seconds.
Looks good here.
Billy
"Bryan Duxbury" <br...@rapleaf.com> wrote in
message news:475A4291-05EE-469F-B3BD-55759DAE2A8E@rapleaf.com...
>I posted another version of the patch that fixes this problem, I think.
>Give it another try?
>
> (Sorry for relying on you to do the testing - I figure you already have
> the framework set up, and I'm currently trapped in an airport.)
>
> -Bryan
>
> On Dec 29, 2007, at 8:44 PM, Billy wrote:
>
>> found this in the rest log i running rest outside of master and logging
>> it
>>
>> 07/12/29 22:36:00 WARN rest: /api/search_index/scanner/3977a5e4:
>> java.lang.ArrayIndexOutOfBoundsException: 3
>> at
>> org.apache.hadoop.hbase.rest.ScannerHandler.doDelete
>> (ScannerHandler.java:132)
>> at
>> org.apache.hadoop.hbase.rest.Dispatcher.doDelete(Dispatcher.java:146)
>> at javax.servlet.http.HttpServlet.service(HttpServlet.java: 715)
>> at javax.servlet.http.HttpServlet.service(HttpServlet.java: 802)
>> at
>> org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
>> at
>> org.mortbay.jetty.servlet.WebApplicationHandler.dispatch
>> (WebApplicationHandler.java:475)
>> at
>> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java: 567)
>> at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
>> at
>> org.mortbay.jetty.servlet.WebApplicationContext.handle
>> (WebApplicationContext.java:635)
>> at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
>> at org.mortbay.http.HttpServer.service(HttpServer.java:954)
>> at org.mortbay.http.HttpConnection.service
>> (HttpConnection.java:814)
>> at
>> org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
>> at org.mortbay.http.HttpConnection.handle
>> (HttpConnection.java:831)
>> at
>> org.mortbay.http.SocketListener.handleConnection
>> (SocketListener.java:244)
>> at org.mortbay.util.ThreadedServer.handle
>> (ThreadedServer.java:357)
>> at org.mortbay.util.ThreadPool$PoolThread.run
>> (ThreadPool.java:534)
>>
>>
>> "Bryan Duxbury" <br...@rapleaf.com> wrote in
>> message
>> news:BE0CC670-92BE-41E4-9A94-EAAB72A09F02@rapleaf.com...
>>> I've created an issue and submitted a patch to fix the problem.
>>> Billy, can you download the patch and check to see if it works alright?
>>>
>>> https://issues.apache.org/jira/browse/HADOOP-2504
>>>
>>> -Bryan
>>>
>>> On Dec 29, 2007, at 3:36 PM, Billy wrote:
>>>
>>>> I checked and added the delete option to my code for the scanner
>>>> based on
>>>> the api from wiki but it looks like its not working at this time
>>>> basedo nthe
>>>> code and responce I got form the rest interfase. i get a "Not
>>>> hooked back up
>>>> yet" responce any idea on when this will be fixed?
>>>>
>>>> Thanks
>>>>
>>>> src/contrib/hbase/src/java/org/apache/hadoop/hbase/rest/
>>>> ScannerHandler.java
>>>>
>>>> public void doDelete(HttpServletRequest request, HttpServletResponse
>>>> response,
>>>> String[] pathSegments)
>>>> throws ServletException, IOException {
>>>> doMethodNotAllowed(response, "Not hooked back up yet!");
>>>> }
>>>>
>>>>
>>>> "Bryan Duxbury" <br...@rapleaf.com> wrote
>>>> in
>>>> message
>>>> news:0459216F-F3F0-46C1-B7DB-57A6479BD809@rapleaf.com...
>>>>> Are you closing the scanners when you're done? If not, those might be
>>>>> hanging around for a long time. I don't think we've built in the
>>>>> proper
>>>>> timeout logic to make that work by itself.
>>>>>
>>>>> -Bryan
>>>>>
>>>>> On Dec 21, 2007, at 5:10 PM, Billy wrote:
>>>>>
>>>>>> I was thanking the same thing and been running REST outside of the
>>>>>> Master on
>>>>>> each server for about 5 hours now and used the master as a
>>>>>> backup if
>>>>>> local
>>>>>> rest interface failed. You are right I seen a little faster
>>>>>> processing
>>>>>> time
>>>>>> from doing this vs. using just the master.
>>>>>>
>>>>>> Seams the problem is not with the master its self looks like
>>>>>> REST is
>>>>>> using
>>>>>> up more and more memory not sure but I thank its to do with inserts
>>>>>> maybe
>>>>>> not but the memory usage is going up I an doing a scanner 2 threads
>>>>>> reading
>>>>>> rows and processing the data and inserting it in to a separate table
>>>>>> building a inverted index.
>>>>>>
>>>>>> I will restart everything when this job is done and try to do just
>>>>>> inserts
>>>>>> and see if its the scanner or inserts.
>>>>>>
>>>>>> The master is holding at about 75mb and the rest interfaces are
>>>>>> up to
>>>>>> 400MB
>>>>>> and slowly going up on the ones running the jobs.
>>>>>>
>>>>>> I am still testing I will see what else I can come up with.
>>>>>>
>>>>>> Billy
>>>>>>
>>>>>>
>>>>>> "stack" <st...@duboce.net> wrote in
>>>>>> message
>>>>>> news:476C1AA8.3030306@duboce.net...
>>>>>>> Hey Billy:
>>>>>>>
>>>>>>> Master itself should use little memory and though it is not out
>>>>>>> of the
>>>>>>> realm of possibiliites, it should not have a leak.
>>>>>>>
>>>>>>> Are you running with the default heap size? You might want to
>>>>>>> give it
>>>>>>> more memory if you are (See
>>>>>>> http://wiki.apache.org/lucene-hadoop/Hbase/FAQ#3 for how).
>>>>>>>
>>>>>>> If you are uploading all via the REST server running on the
>>>>>>> master, the
>>>>>>> problem as you speculate, could be in the REST servlet itself
>>>>>>> (though
>>>>>>> it
>>>>>>> looks like it shouldn't be holding on to anything having given it a
>>>>>>> cursory glance). You could try running the REST server
>>>>>>> independent of
>>>>>>> the
>>>>>>> master. Grep for 'Starting the REST Server' in this page,
>>>>>>> http://wiki.apache.org/lucene-hadoop/Hbase/HbaseRest, for how
>>>>>>> (If you
>>>>>>> are
>>>>>>> only running one REST instance, your upload might go faster if
>>>>>>> you run
>>>>>>> multiple).
>>>>>>>
>>>>>>> St.Ack
>>>>>>>
>>>>>>>
>>>>>>> Billy wrote:
>>>>>>>> I forgot to say that once restart the master only uses about
>>>>>>>> 70mb of
>>>>>>>> memory
>>>>>>>>
>>>>>>>> Billy
>>>>>>>>
>>>>>>>> "Billy" <sa...@pearsonwholesale.com>
>>>>>>>> wrote
>>>>>>>> in message news:fkejpo$u8c$1@ger.gmane.org...
>>>>>>>>
>>>>>>>>> I not sure of this but why does the master server use up so much
>>>>>>>>> memory.
>>>>>>>>> I been running an script that been inserting data into a
>>>>>>>>> table for a
>>>>>>>>> little over 24 hours and the master crashed because of
>>>>>>>>> java.lang.OutOfMemoryError: Java heap space.
>>>>>>>>>
>>>>>>>>> So my question is why does the master use up so much memory
>>>>>>>>> at most
>>>>>>>>> it
>>>>>>>>> should store the -ROOT-,.META. tables in memory and block to
>>>>>>>>> table
>>>>>>>>> mapping.
>>>>>>>>>
>>>>>>>>> Is it cache or a memory leak?
>>>>>>>>>
>>>>>>>>> I am using the rest interface so could that be the reason?
>>>>>>>>>
>>>>>>>>> I inserted according to the high edit ids on all the region
>>>>>>>>> servers
>>>>>>>>> about
>>>>>>>>> 51,932,760 edits and the master ran out of memory with a heap of
>>>>>>>>> about
>>>>>>>>> 1GB.
>>>>>>>>>
>>>>>>>>> The other side to this is the data I inserted is only taking up
>>>>>>>>> 886.61
>>>>>>>>> MB and that's with
>>>>>>>>> dfs.replication set to 2 so half that is only 440MB of data
>>>>>>>>> compressed
>>>>>>>>> at the block level.
>>>>>>>>> From what I understand the master should have lower memory
>>>>>>>>> and cpu
>>>>>>>>> usage
>>>>>>>>> and the namenode on hadoop should be the memory hog it has to
>>>>>>>>> keep up
>>>>>>>>> with all the data about the blocks.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>
>
Re: hbase master heap space
Posted by Bryan Duxbury <br...@rapleaf.com>.
I posted another version of the patch that fixes this problem, I
think. Give it another try?
(Sorry for relying on you to do the testing - I figure you already
have the framework set up, and I'm currently trapped in an airport.)
-Bryan
On Dec 29, 2007, at 8:44 PM, Billy wrote:
> found this in the rest log i running rest outside of master and
> logging it
>
> 07/12/29 22:36:00 WARN rest: /api/search_index/scanner/3977a5e4:
> java.lang.ArrayIndexOutOfBoundsException: 3
> at
> org.apache.hadoop.hbase.rest.ScannerHandler.doDelete
> (ScannerHandler.java:132)
> at
> org.apache.hadoop.hbase.rest.Dispatcher.doDelete(Dispatcher.java:146)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:
> 715)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:
> 802)
> at
> org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
> at
> org.mortbay.jetty.servlet.WebApplicationHandler.dispatch
> (WebApplicationHandler.java:475)
> at
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:
> 567)
> at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
> at
> org.mortbay.jetty.servlet.WebApplicationContext.handle
> (WebApplicationContext.java:635)
> at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
> at org.mortbay.http.HttpServer.service(HttpServer.java:954)
> at org.mortbay.http.HttpConnection.service
> (HttpConnection.java:814)
> at
> org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
> at org.mortbay.http.HttpConnection.handle
> (HttpConnection.java:831)
> at
> org.mortbay.http.SocketListener.handleConnection
> (SocketListener.java:244)
> at org.mortbay.util.ThreadedServer.handle
> (ThreadedServer.java:357)
> at org.mortbay.util.ThreadPool$PoolThread.run
> (ThreadPool.java:534)
>
>
> "Bryan Duxbury" <br...@rapleaf.com> wrote in
> message news:BE0CC670-92BE-41E4-9A94-EAAB72A09F02@rapleaf.com...
>> I've created an issue and submitted a patch to fix the problem.
>> Billy, can you download the patch and check to see if it works
>> alright?
>>
>> https://issues.apache.org/jira/browse/HADOOP-2504
>>
>> -Bryan
>>
>> On Dec 29, 2007, at 3:36 PM, Billy wrote:
>>
>>> I checked and added the delete option to my code for the scanner
>>> based on
>>> the api from wiki but it looks like its not working at this time
>>> basedo nthe
>>> code and responce I got form the rest interfase. i get a "Not
>>> hooked back up
>>> yet" responce any idea on when this will be fixed?
>>>
>>> Thanks
>>>
>>> src/contrib/hbase/src/java/org/apache/hadoop/hbase/rest/
>>> ScannerHandler.java
>>>
>>> public void doDelete(HttpServletRequest request, HttpServletResponse
>>> response,
>>> String[] pathSegments)
>>> throws ServletException, IOException {
>>> doMethodNotAllowed(response, "Not hooked back up yet!");
>>> }
>>>
>>>
>>> "Bryan Duxbury" <br...@rapleaf.com> wrote in
>>> message
>>> news:0459216F-F3F0-46C1-B7DB-57A6479BD809@rapleaf.com...
>>>> Are you closing the scanners when you're done? If not, those
>>>> might be
>>>> hanging around for a long time. I don't think we've built in the
>>>> proper
>>>> timeout logic to make that work by itself.
>>>>
>>>> -Bryan
>>>>
>>>> On Dec 21, 2007, at 5:10 PM, Billy wrote:
>>>>
>>>>> I was thanking the same thing and been running REST outside of the
>>>>> Master on
>>>>> each server for about 5 hours now and used the master as a
>>>>> backup if
>>>>> local
>>>>> rest interface failed. You are right I seen a little faster
>>>>> processing
>>>>> time
>>>>> from doing this vs. using just the master.
>>>>>
>>>>> Seams the problem is not with the master its self looks like
>>>>> REST is
>>>>> using
>>>>> up more and more memory not sure but I thank its to do with
>>>>> inserts
>>>>> maybe
>>>>> not but the memory usage is going up I an doing a scanner 2
>>>>> threads
>>>>> reading
>>>>> rows and processing the data and inserting it in to a separate
>>>>> table
>>>>> building a inverted index.
>>>>>
>>>>> I will restart everything when this job is done and try to do just
>>>>> inserts
>>>>> and see if its the scanner or inserts.
>>>>>
>>>>> The master is holding at about 75mb and the rest interfaces are
>>>>> up to
>>>>> 400MB
>>>>> and slowly going up on the ones running the jobs.
>>>>>
>>>>> I am still testing I will see what else I can come up with.
>>>>>
>>>>> Billy
>>>>>
>>>>>
>>>>> "stack" <st...@duboce.net> wrote in
>>>>> message
>>>>> news:476C1AA8.3030306@duboce.net...
>>>>>> Hey Billy:
>>>>>>
>>>>>> Master itself should use little memory and though it is not out
>>>>>> of the
>>>>>> realm of possibiliites, it should not have a leak.
>>>>>>
>>>>>> Are you running with the default heap size? You might want to
>>>>>> give it
>>>>>> more memory if you are (See
>>>>>> http://wiki.apache.org/lucene-hadoop/Hbase/FAQ#3 for how).
>>>>>>
>>>>>> If you are uploading all via the REST server running on the
>>>>>> master, the
>>>>>> problem as you speculate, could be in the REST servlet itself
>>>>>> (though
>>>>>> it
>>>>>> looks like it shouldn't be holding on to anything having given
>>>>>> it a
>>>>>> cursory glance). You could try running the REST server
>>>>>> independent of
>>>>>> the
>>>>>> master. Grep for 'Starting the REST Server' in this page,
>>>>>> http://wiki.apache.org/lucene-hadoop/Hbase/HbaseRest, for how
>>>>>> (If you
>>>>>> are
>>>>>> only running one REST instance, your upload might go faster if
>>>>>> you run
>>>>>> multiple).
>>>>>>
>>>>>> St.Ack
>>>>>>
>>>>>>
>>>>>> Billy wrote:
>>>>>>> I forgot to say that once restart the master only uses about
>>>>>>> 70mb of
>>>>>>> memory
>>>>>>>
>>>>>>> Billy
>>>>>>>
>>>>>>> "Billy" <sa...@pearsonwholesale.com>
>>>>>>> wrote
>>>>>>> in message news:fkejpo$u8c$1@ger.gmane.org...
>>>>>>>
>>>>>>>> I not sure of this but why does the master server use up so
>>>>>>>> much
>>>>>>>> memory.
>>>>>>>> I been running an script that been inserting data into a
>>>>>>>> table for a
>>>>>>>> little over 24 hours and the master crashed because of
>>>>>>>> java.lang.OutOfMemoryError: Java heap space.
>>>>>>>>
>>>>>>>> So my question is why does the master use up so much memory
>>>>>>>> at most
>>>>>>>> it
>>>>>>>> should store the -ROOT-,.META. tables in memory and block to
>>>>>>>> table
>>>>>>>> mapping.
>>>>>>>>
>>>>>>>> Is it cache or a memory leak?
>>>>>>>>
>>>>>>>> I am using the rest interface so could that be the reason?
>>>>>>>>
>>>>>>>> I inserted according to the high edit ids on all the region
>>>>>>>> servers
>>>>>>>> about
>>>>>>>> 51,932,760 edits and the master ran out of memory with a
>>>>>>>> heap of
>>>>>>>> about
>>>>>>>> 1GB.
>>>>>>>>
>>>>>>>> The other side to this is the data I inserted is only taking up
>>>>>>>> 886.61
>>>>>>>> MB and that's with
>>>>>>>> dfs.replication set to 2 so half that is only 440MB of data
>>>>>>>> compressed
>>>>>>>> at the block level.
>>>>>>>> From what I understand the master should have lower memory
>>>>>>>> and cpu
>>>>>>>> usage
>>>>>>>> and the namenode on hadoop should be the memory hog it has to
>>>>>>>> keep up
>>>>>>>> with all the data about the blocks.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>
>
>
Re: hbase master heap space
Posted by Billy <sa...@pearsonwholesale.com>.
Found this in the REST log. I'm running REST outside of the master and logging it:
07/12/29 22:36:00 WARN rest: /api/search_index/scanner/3977a5e4:
java.lang.ArrayIndexOutOfBoundsException: 3
at
org.apache.hadoop.hbase.rest.ScannerHandler.doDelete(ScannerHandler.java:132)
at
org.apache.hadoop.hbase.rest.Dispatcher.doDelete(Dispatcher.java:146)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:715)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
at
org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:427)
at
org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:475)
at
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:567)
at org.mortbay.http.HttpContext.handle(HttpContext.java:1565)
at
org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:635)
at org.mortbay.http.HttpContext.handle(HttpContext.java:1517)
at org.mortbay.http.HttpServer.service(HttpServer.java:954)
at org.mortbay.http.HttpConnection.service(HttpConnection.java:814)
at
org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:981)
at org.mortbay.http.HttpConnection.handle(HttpConnection.java:831)
at
org.mortbay.http.SocketListener.handleConnection(SocketListener.java:244)
at org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:357)
at org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:534)
"Bryan Duxbury" <br...@rapleaf.com> wrote in
message news:BE0CC670-92BE-41E4-9A94-EAAB72A09F02@rapleaf.com...
> I've created an issue and submitted a patch to fix the problem.
> Billy, can you download the patch and check to see if it works alright?
>
> https://issues.apache.org/jira/browse/HADOOP-2504
>
> -Bryan
>
> On Dec 29, 2007, at 3:36 PM, Billy wrote:
>
>> I checked and added the delete option to my code for the scanner
>> based on
>> the api from wiki but it looks like its not working at this time
>> basedo nthe
>> code and responce I got form the rest interfase. i get a "Not
>> hooked back up
>> yet" responce any idea on when this will be fixed?
>>
>> Thanks
>>
>> src/contrib/hbase/src/java/org/apache/hadoop/hbase/rest/
>> ScannerHandler.java
>>
>> public void doDelete(HttpServletRequest request, HttpServletResponse
>> response,
>> String[] pathSegments)
>> throws ServletException, IOException {
>> doMethodNotAllowed(response, "Not hooked back up yet!");
>> }
>>
>>
>> "Bryan Duxbury" <br...@rapleaf.com> wrote in
>> message
>> news:0459216F-F3F0-46C1-B7DB-57A6479BD809@rapleaf.com...
>>> Are you closing the scanners when you're done? If not, those might be
>>> hanging around for a long time. I don't think we've built in the
>>> proper
>>> timeout logic to make that work by itself.
>>>
>>> -Bryan
>>>
>>> On Dec 21, 2007, at 5:10 PM, Billy wrote:
>>>
>>>> I was thanking the same thing and been running REST outside of the
>>>> Master on
>>>> each server for about 5 hours now and used the master as a
>>>> backup if
>>>> local
>>>> rest interface failed. You are right I seen a little faster
>>>> processing
>>>> time
>>>> from doing this vs. using just the master.
>>>>
>>>> Seams the problem is not with the master its self looks like
>>>> REST is
>>>> using
>>>> up more and more memory not sure but I thank its to do with inserts
>>>> maybe
>>>> not but the memory usage is going up I an doing a scanner 2 threads
>>>> reading
>>>> rows and processing the data and inserting it in to a separate table
>>>> building a inverted index.
>>>>
>>>> I will restart everything when this job is done and try to do just
>>>> inserts
>>>> and see if its the scanner or inserts.
>>>>
>>>> The master is holding at about 75mb and the rest interfaces are
>>>> up to
>>>> 400MB
>>>> and slowly going up on the ones running the jobs.
>>>>
>>>> I am still testing I will see what else I can come up with.
>>>>
>>>> Billy
>>>>
>>>>
>>>> "stack" <st...@duboce.net> wrote in
>>>> message
>>>> news:476C1AA8.3030306@duboce.net...
>>>>> Hey Billy:
>>>>>
>>>>> Master itself should use little memory and though it is not out
>>>>> of the
>>>>> realm of possibiliites, it should not have a leak.
>>>>>
>>>>> Are you running with the default heap size? You might want to
>>>>> give it
>>>>> more memory if you are (See
>>>>> http://wiki.apache.org/lucene-hadoop/Hbase/FAQ#3 for how).
>>>>>
>>>>> If you are uploading all via the REST server running on the
>>>>> master, the
>>>>> problem as you speculate, could be in the REST servlet itself
>>>>> (though
>>>>> it
>>>>> looks like it shouldn't be holding on to anything having given it a
>>>>> cursory glance). You could try running the REST server
>>>>> independent of
>>>>> the
>>>>> master. Grep for 'Starting the REST Server' in this page,
>>>>> http://wiki.apache.org/lucene-hadoop/Hbase/HbaseRest, for how
>>>>> (If you
>>>>> are
>>>>> only running one REST instance, your upload might go faster if
>>>>> you run
>>>>> multiple).
>>>>>
>>>>> St.Ack
>>>>>
>>>>>
>>>>> Billy wrote:
>>>>>> I forgot to say that once restart the master only uses about
>>>>>> 70mb of
>>>>>> memory
>>>>>>
>>>>>> Billy
>>>>>>
>>>>>> "Billy" <sa...@pearsonwholesale.com>
>>>>>> wrote
>>>>>> in message news:fkejpo$u8c$1@ger.gmane.org...
>>>>>>
>>>>>>> I not sure of this but why does the master server use up so much
>>>>>>> memory.
>>>>>>> I been running an script that been inserting data into a
>>>>>>> table for a
>>>>>>> little over 24 hours and the master crashed because of
>>>>>>> java.lang.OutOfMemoryError: Java heap space.
>>>>>>>
>>>>>>> So my question is why does the master use up so much memory
>>>>>>> at most
>>>>>>> it
>>>>>>> should store the -ROOT-,.META. tables in memory and block to
>>>>>>> table
>>>>>>> mapping.
>>>>>>>
>>>>>>> Is it cache or a memory leak?
>>>>>>>
>>>>>>> I am using the rest interface so could that be the reason?
>>>>>>>
>>>>>>> I inserted according to the high edit ids on all the region
>>>>>>> servers
>>>>>>> about
>>>>>>> 51,932,760 edits and the master ran out of memory with a heap of
>>>>>>> about
>>>>>>> 1GB.
>>>>>>>
>>>>>>> The other side to this is the data I inserted is only taking up
>>>>>>> 886.61
>>>>>>> MB and that's with
>>>>>>> dfs.replication set to 2 so half that is only 440MB of data
>>>>>>> compressed
>>>>>>> at the block level.
>>>>>>> From what I understand the master should have lower memory
>>>>>>> and cpu
>>>>>>> usage
>>>>>>> and the namenode on hadoop should be the memory hog it has to
>>>>>>> keep up
>>>>>>> with all the data about the blocks.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>
>
Re: hbase master heap space
Posted by Bryan Duxbury <br...@rapleaf.com>.
I've created an issue and submitted a patch to fix the problem.
Billy, can you download the patch and check to see if it works alright?
https://issues.apache.org/jira/browse/HADOOP-2504
-Bryan
On Dec 29, 2007, at 3:36 PM, Billy wrote:
> I checked and added the delete option to my code for the scanner
> based on
> the api from wiki but it looks like its not working at this time
> basedo nthe
> code and responce I got form the rest interfase. i get a "Not
> hooked back up
> yet" responce any idea on when this will be fixed?
>
> Thanks
>
> src/contrib/hbase/src/java/org/apache/hadoop/hbase/rest/
> ScannerHandler.java
>
> public void doDelete(HttpServletRequest request, HttpServletResponse
> response,
> String[] pathSegments)
> throws ServletException, IOException {
> doMethodNotAllowed(response, "Not hooked back up yet!");
> }
>
>
> "Bryan Duxbury" <br...@rapleaf.com> wrote in
> message news:0459216F-F3F0-46C1-B7DB-57A6479BD809@rapleaf.com...
>> Are you closing the scanners when you're done? If not, those might be
>> hanging around for a long time. I don't think we've built in the
>> proper
>> timeout logic to make that work by itself.
>>
>> -Bryan
>>
>> On Dec 21, 2007, at 5:10 PM, Billy wrote:
>>
>>> I was thanking the same thing and been running REST outside of the
>>> Master on
>>> each server for about 5 hours now and used the master as a
>>> backup if
>>> local
>>> rest interface failed. You are right I seen a little faster
>>> processing
>>> time
>>> from doing this vs. using just the master.
>>>
>>> Seams the problem is not with the master its self looks like
>>> REST is
>>> using
>>> up more and more memory not sure but I thank its to do with inserts
>>> maybe
>>> not but the memory usage is going up I an doing a scanner 2 threads
>>> reading
>>> rows and processing the data and inserting it in to a separate table
>>> building a inverted index.
>>>
>>> I will restart everything when this job is done and try to do just
>>> inserts
>>> and see if its the scanner or inserts.
>>>
>>> The master is holding at about 75mb and the rest interfaces are
>>> up to
>>> 400MB
>>> and slowly going up on the ones running the jobs.
>>>
>>> I am still testing I will see what else I can come up with.
>>>
>>> Billy
>>>
>>>
>>> "stack" <st...@duboce.net> wrote in message
>>> news:476C1AA8.3030306@duboce.net...
>>>> Hey Billy:
>>>>
>>>> Master itself should use little memory and though it is not out
>>>> of the
>>>> realm of possibiliites, it should not have a leak.
>>>>
>>>> Are you running with the default heap size? You might want to
>>>> give it
>>>> more memory if you are (See
>>>> http://wiki.apache.org/lucene-hadoop/Hbase/FAQ#3 for how).
>>>>
>>>> If you are uploading all via the REST server running on the
>>>> master, the
>>>> problem as you speculate, could be in the REST servlet itself
>>>> (though
>>>> it
>>>> looks like it shouldn't be holding on to anything having given it a
>>>> cursory glance). You could try running the REST server
>>>> independent of
>>>> the
>>>> master. Grep for 'Starting the REST Server' in this page,
>>>> http://wiki.apache.org/lucene-hadoop/Hbase/HbaseRest, for how
>>>> (If you
>>>> are
>>>> only running one REST instance, your upload might go faster if
>>>> you run
>>>> multiple).
>>>>
>>>> St.Ack
>>>>
>>>>
>>>> Billy wrote:
>>>>> I forgot to say that once restart the master only uses about
>>>>> 70mb of
>>>>> memory
>>>>>
>>>>> Billy
>>>>>
>>>>> "Billy" <sa...@pearsonwholesale.com> wrote
>>>>> in message news:fkejpo$u8c$1@ger.gmane.org...
>>>>>
>>>>>> I not sure of this but why does the master server use up so much
>>>>>> memory.
>>>>>> I been running an script that been inserting data into a
>>>>>> table for a
>>>>>> little over 24 hours and the master crashed because of
>>>>>> java.lang.OutOfMemoryError: Java heap space.
>>>>>>
>>>>>> So my question is why does the master use up so much memory
>>>>>> at most
>>>>>> it
>>>>>> should store the -ROOT-,.META. tables in memory and block to
>>>>>> table
>>>>>> mapping.
>>>>>>
>>>>>> Is it cache or a memory leak?
>>>>>>
>>>>>> I am using the rest interface so could that be the reason?
>>>>>>
>>>>>> I inserted according to the high edit ids on all the region
>>>>>> servers
>>>>>> about
>>>>>> 51,932,760 edits and the master ran out of memory with a heap of
>>>>>> about
>>>>>> 1GB.
>>>>>>
>>>>>> The other side to this is the data I inserted is only taking up
>>>>>> 886.61
>>>>>> MB and that's with
>>>>>> dfs.replication set to 2 so half that is only 440MB of data
>>>>>> compressed
>>>>>> at the block level.
>>>>>> From what I understand the master should have lower memory
>>>>>> and cpu
>>>>>> usage
>>>>>> and the namenode on hadoop should be the memory hog it has to
>>>>>> keep up
>>>>>> with all the data about the blocks.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>
>
>
Re: hbase master heap space
Posted by Billy <sa...@pearsonwholesale.com>.
I checked and added the delete option to my scanner code based on the API from the
wiki, but it looks like it's not working at this time, based on the code and the
response I got from the REST interface. I get a "Not hooked back up yet" response.
Any idea on when this will be fixed?
Thanks
src/contrib/hbase/src/java/org/apache/hadoop/hbase/rest/ScannerHandler.java
  public void doDelete(HttpServletRequest request,
      HttpServletResponse response, String[] pathSegments)
      throws ServletException, IOException {
    doMethodNotAllowed(response, "Not hooked back up yet!");
  }
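For reference, closing a scanner over REST is just an HTTP DELETE against the scanner's URL. Here is a minimal sketch of building that raw request, in the same spirit as the socket-based PHP inserts earlier in the thread; the /api/<table>/scanner/<id> path shape matches the URLs seen in this thread, but the scanner id and host values below are made-up example values:

```java
// Sketch: build the raw HTTP DELETE request that should close a REST
// scanner once doDelete is hooked back up. The id and host are
// illustrative only.
public class ScannerDeleteRequest {
    public static String build(String table, String scannerId, String host) {
        return "DELETE /api/" + table + "/scanner/" + scannerId + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "Connection: close\r\n"
             + "\r\n";
    }

    public static void main(String[] args) {
        // Example: close scanner 3977a5e4 on the search_index table.
        System.out.print(build("search_index", "3977a5e4", "localhost:60050"));
    }
}
```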
"Bryan Duxbury" <br...@rapleaf.com> wrote in
message news:0459216F-F3F0-46C1-B7DB-57A6479BD809@rapleaf.com...
> Are you closing the scanners when you're done? If not, those might be
> hanging around for a long time. I don't think we've built in the proper
> timeout logic to make that work by itself.
>
> -Bryan
>
> On Dec 21, 2007, at 5:10 PM, Billy wrote:
>
>> I was thanking the same thing and been running REST outside of the
>> Master on
>> each server for about 5 hours now and used the master as a backup if
>> local
>> rest interface failed. You are right I seen a little faster processing
>> time
>> from doing this vs. using just the master.
>>
>> Seams the problem is not with the master its self looks like REST is
>> using
>> up more and more memory not sure but I thank its to do with inserts
>> maybe
>> not but the memory usage is going up I an doing a scanner 2 threads
>> reading
>> rows and processing the data and inserting it in to a separate table
>> building a inverted index.
>>
>> I will restart everything when this job is done and try to do just
>> inserts
>> and see if its the scanner or inserts.
>>
>> The master is holding at about 75mb and the rest interfaces are up to
>> 400MB
>> and slowly going up on the ones running the jobs.
>>
>> I am still testing I will see what else I can come up with.
>>
>> Billy
>>
>>
>> "stack" <st...@duboce.net> wrote in message
>> news:476C1AA8.3030306@duboce.net...
>>> Hey Billy:
>>>
>>> Master itself should use little memory and though it is not out of the
>>> realm of possibiliites, it should not have a leak.
>>>
>>> Are you running with the default heap size? You might want to give it
>>> more memory if you are (See
>>> http://wiki.apache.org/lucene-hadoop/Hbase/FAQ#3 for how).
>>>
>>> If you are uploading all via the REST server running on the master, the
>>> problem as you speculate, could be in the REST servlet itself (though
>>> it
>>> looks like it shouldn't be holding on to anything having given it a
>>> cursory glance). You could try running the REST server independent of
>>> the
>>> master. Grep for 'Starting the REST Server' in this page,
>>> http://wiki.apache.org/lucene-hadoop/Hbase/HbaseRest, for how (If you
>>> are
>>> only running one REST instance, your upload might go faster if you run
>>> multiple).
>>>
>>> St.Ack
>>>
>>>
>>> Billy wrote:
>>>> I forgot to say that once restart the master only uses about 70mb of
>>>> memory
>>>>
>>>> Billy
>>>>
>>>> "Billy" <sa...@pearsonwholesale.com> wrote
>>>> in message news:fkejpo$u8c$1@ger.gmane.org...
>>>>
>>>>> I not sure of this but why does the master server use up so much
>>>>> memory.
>>>>> I been running an script that been inserting data into a table for a
>>>>> little over 24 hours and the master crashed because of
>>>>> java.lang.OutOfMemoryError: Java heap space.
>>>>>
>>>>> So my question is why does the master use up so much memory at most
>>>>> it
>>>>> should store the -ROOT-,.META. tables in memory and block to table
>>>>> mapping.
>>>>>
>>>>> Is it cache or a memory leak?
>>>>>
>>>>> I am using the rest interface so could that be the reason?
>>>>>
>>>>> I inserted according to the high edit ids on all the region servers
>>>>> about
>>>>> 51,932,760 edits and the master ran out of memory with a heap of
>>>>> about
>>>>> 1GB.
>>>>>
>>>>> The other side to this is the data I inserted is only taking up
>>>>> 886.61
>>>>> MB and that's with
>>>>> dfs.replication set to 2 so half that is only 440MB of data
>>>>> compressed
>>>>> at the block level.
>>>>> From what I understand the master should have lower memory and cpu
>>>>> usage
>>>>> and the namenode on hadoop should be the memory hog it has to keep up
>>>>> with all the data about the blocks.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>
>
Re: hbase master heap space
Posted by Bryan Duxbury <br...@rapleaf.com>.
Are you closing the scanners when you're done? If not, those might be
hanging around for a long time. I don't think we've built in the
proper timeout logic to make that work by itself.
-Bryan
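As a rough illustration of the timeout logic Bryan describes as missing, the server side could record when each scanner was last touched and periodically expire idle ones. This is only a sketch under assumed names (IdleScannerSweeper is hypothetical), not actual HBase code:

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hedged sketch: expire scanners that have sat idle longer than a cutoff,
// so abandoned scanners stop accumulating server-side memory.
public class IdleScannerSweeper {
    private final Map<String, Long> lastUsed = new HashMap<>();
    private final long timeoutMillis;

    public IdleScannerSweeper(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Record that a scanner was just used at time 'now' (millis).
    public void touch(String scannerId, long now) {
        lastUsed.put(scannerId, now);
    }

    // Drop scanners idle past the cutoff; returns how many were expired.
    public int sweep(long now) {
        int expired = 0;
        Iterator<Map.Entry<String, Long>> it = lastUsed.entrySet().iterator();
        while (it.hasNext()) {
            if (now - it.next().getValue() > timeoutMillis) {
                it.remove();
                expired++;
            }
        }
        return expired;
    }
}
```

A background thread would call sweep() every so often and close the underlying HBase scanners for the expired ids.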
On Dec 21, 2007, at 5:10 PM, Billy wrote:
> I was thanking the same thing and been running REST outside of the
> Master on
> each server for about 5 hours now and used the master as a backup
> if local
> rest interface failed. You are right I seen a little faster
> processing time
> from doing this vs. using just the master.
>
> Seams the problem is not with the master its self looks like REST
> is using
> up more and more memory not sure but I thank its to do with inserts
> maybe
> not but the memory usage is going up I an doing a scanner 2 threads
> reading
> rows and processing the data and inserting it in to a separate table
> building a inverted index.
>
> I will restart everything when this job is done and try to do just
> inserts
> and see if its the scanner or inserts.
>
> The master is holding at about 75mb and the rest interfaces are up
> to 400MB
> and slowly going up on the ones running the jobs.
>
> I am still testing I will see what else I can come up with.
>
> Billy
Re: hbase master heap space
Posted by Billy <sa...@pearsonwholesale.com>.
I was thinking the same thing, and I've been running REST outside of the master on
each server for about 5 hours now, with the master as a backup if the local
REST interface fails. You are right: I've seen slightly faster processing times
from doing this vs. using just the master.
It seems the problem is not with the master itself; it looks like REST is using
up more and more memory. I'm not certain, but I think it has to do with inserts.
Memory usage keeps climbing while I run a scanner with 2 threads reading
rows, processing the data, and inserting it into a separate table to
build an inverted index.
I will restart everything when this job is done and try doing just inserts,
to see whether it's the scanner or the inserts.
The master is holding at about 75MB, while the REST interfaces are up to 400MB
and slowly rising on the nodes running the jobs.
I am still testing and will see what else I can come up with.
Billy
"stack" <st...@duboce.net> wrote in message
news:476C1AA8.3030306@duboce.net...
> Hey Billy:
>
> Master itself should use little memory and though it is not out of the
> realm of possibiliites, it should not have a leak.
>
> Are you running with the default heap size? You might want to give it
> more memory if you are (See
> http://wiki.apache.org/lucene-hadoop/Hbase/FAQ#3 for how).
>
> If you are uploading all via the REST server running on the master, the
> problem as you speculate, could be in the REST servlet itself (though it
> looks like it shouldn't be holding on to anything having given it a
> cursory glance). You could try running the REST server independent of the
> master. Grep for 'Starting the REST Server' in this page,
> http://wiki.apache.org/lucene-hadoop/Hbase/HbaseRest, for how (If you are
> only running one REST instance, your upload might go faster if you run
> multiple).
>
> St.Ack
>
>
Re: hbase master heap space
Posted by stack <st...@duboce.net>.
Hey Billy:
The master itself should use little memory, and though it is not out of the
realm of possibility, it should not have a leak.
Are you running with the default heap size? You might want to give it
more memory if you are (See
http://wiki.apache.org/lucene-hadoop/Hbase/FAQ#3 for how).
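For reference, the knob the FAQ points at is a heap-size override in `conf/hbase-env.sh`. As a sketch (the variable name below is the one used by HBase's env script; the value is only an example, not a recommendation):

```shell
# conf/hbase-env.sh -- raise the maximum heap for the HBase daemons.
# HBASE_HEAPSIZE is in megabytes; 2000 here is just an example value.
export HBASE_HEAPSIZE=2000
```

The daemons must be restarted for the new heap size to take effect.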
If you are uploading all via the REST server running on the master, the
problem, as you speculate, could be in the REST servlet itself (though it
looks like it shouldn't be holding on to anything having given it a
cursory glance). You could try running the REST server independent of
the master. Grep for 'Starting the REST Server' in this page,
http://wiki.apache.org/lucene-hadoop/Hbase/HbaseRest, for how (If you
are only running one REST instance, your upload might go faster if you
run multiple).
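As a sketch only (the exact command form varies by release; the HbaseRest wiki page above is the authoritative reference), a standalone REST instance can be launched on each node along these lines:

```shell
# Start a REST server outside the master process; run one per node and
# point clients at the local instance to spread the upload load.
${HBASE_HOME}/bin/hbase rest start
```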
St.Ack
Billy wrote:
> I forgot to say that once restart the master only uses about 70mb of memory
>
> Billy
>
Re: hbase master heap space
Posted by Billy <sa...@pearsonwholesale.com>.
I forgot to say that once restarted, the master only uses about 70MB of memory
Billy
"Billy" <sa...@pearsonwholesale.com> wrote in
message news:fkejpo$u8c$1@ger.gmane.org...
>I'm not sure about this, but why does the master server use up so much
>memory? I've been running a script that has been inserting data into a table
>for a little over 24 hours, and the master crashed because of
>java.lang.OutOfMemoryError: Java heap space.
>
> So my question is: why does the master use up so much memory? At most it
> should store the -ROOT- and .META. tables in memory, plus the block-to-table
> mapping.
>
> Is it cache or a memory leak?
>
> I am using the REST interface, so could that be the reason?
>
> According to the highest edit ids on all the region servers, I inserted about
> 51,932,760 edits, and the master ran out of memory with a heap of about
> 1GB.
>
> The other side of this is that the data I inserted is only taking up 886.61 MB,
> and that's with dfs.replication set to 2, so half of that is only about 440MB
> of data compressed at the block level.
>
> From what I understand, the master should have low memory and CPU usage,
> and the namenode on Hadoop should be the memory hog, since it has to keep
> up with all the metadata about the blocks.
>
>
>
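As a sanity check on the numbers in the original post (886.61 MB on HDFS with dfs.replication set to 2), the logical data size works out as follows; the post rounds it down to about 440MB:

```python
raw_on_hdfs_mb = 886.61   # total bytes stored on HDFS, counting every replica
replication = 2           # dfs.replication
logical_mb = raw_on_hdfs_mb / replication
print(f"~{logical_mb:.0f} MB of block-compressed table data")  # ~443 MB
```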