Posted to issues@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2015/11/03 23:25:27 UTC

[jira] [Commented] (HBASE-14736) ITBLL debugging search tool OOMEs on big dataset

    [ https://issues.apache.org/jira/browse/HBASE-14736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14988294#comment-14988294 ] 

stack commented on HBASE-14736:
-------------------------------

I tried just getting the first 1k keys in each partition. I then hit an interesting issue where WALs were moving out from under me while the task ran... they were no longer found. I got the first-1k job to pass, but it found no keys.... Need to dig in more. It seems that at this scale, these tools as written no longer work (I lost use of the cluster, so I could pursue it no further for the time being).
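
Roughly what I mean by "first 1k in each partition", as a sketch only -- the helper, the hardcoded cap, and the BytesWritable assumption are illustrative, not the actual change (the real tool reads these files as raw binary via SequenceFileAsBinaryInputFormat):

{code}
import java.io.IOException;
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;

public class FirstKeysSketch {
  /**
   * Read at most 'cap' keys from one partition file rather than slurping
   * every key into memory. Assumes the file's key/value classes are
   * BytesWritable.
   */
  static SortedSet<byte[]> readFirstKeys(Configuration conf, Path partition, int cap)
      throws IOException {
    SortedSet<byte[]> keys = new TreeSet<byte[]>(Bytes.BYTES_COMPARATOR);
    SequenceFile.Reader reader =
        new SequenceFile.Reader(conf, SequenceFile.Reader.file(partition));
    try {
      BytesWritable key = new BytesWritable();
      BytesWritable value = new BytesWritable();
      int read = 0;
      // Stop after 'cap' keys so the heap stays bounded no matter the file size.
      while (read < cap && reader.next(key, value)) {
        keys.add(Arrays.copyOf(key.getBytes(), key.getLength()));
        read++;
      }
    } finally {
      reader.close();
    }
    return keys;
  }
}
{code}

Capping the read at least bounds the heap, but it does nothing for WALs moving out from under the job; that looks like a separate problem.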

> ITBLL debugging search tool OOMEs on big dataset
> ------------------------------------------------
>
>                 Key: HBASE-14736
>                 URL: https://issues.apache.org/jira/browse/HBASE-14736
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>
> I ran an ITBLL on an 80-node cluster sized to do 100B items. The job failed with 300M undefined items (branch-1). I tried to run the search tool to debug the loss -- see https://docs.google.com/document/d/14Tvu5yWYNBDFkh8xCqLkU9tlyNWhJv3GjDGOkqZU1eE/edit# -- but it OOME'd:
> {code}
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>         at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:834)
>         at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:896)
>         at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:697)
>         at java.io.DataInputStream.readInt(DataInputStream.java:387)
>         at org.apache.hadoop.io.SequenceFile$Reader.nextRawKey(SequenceFile.java:2452)
>         at org.apache.hadoop.mapreduce.lib.input.SequenceFileAsBinaryInputFormat$SequenceFileAsBinaryRecordReader.nextKeyValue(SequenceFileAsBinaryInputFormat.java:119)
>         at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Search.readFileToSearch(IntegrationTestBigLinkedList.java:775)
>         at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Search.readKeysToSearch(IntegrationTestBigLinkedList.java:757)
>         at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Search.run(IntegrationTestBigLinkedList.java:726)
>         at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Search.run(IntegrationTestBigLinkedList.java:657)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList.runTestFromCommandLine(IntegrationTestBigLinkedList.java:1646)
>         at org.apache.hadoop.hbase.IntegrationTestBase.doWork(IntegrationTestBase.java:123)
>         at org.apache.hadoop.hbase.util.AbstractHBaseTool.run(AbstractHBaseTool.java:112)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList.main(IntegrationTestBigLinkedList.java:1686)
> {code}
> It's trying to build a sorted set out of the 300M items.... Dang.
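> Back-of-envelope, assuming the standard 16-byte ITBLL keys: 300M TreeSet entries at roughly 80 bytes apiece (byte[] payload plus array header plus TreeMap.Entry bookkeeping) comes out to something like 24GB of heap, which no reasonable client-side JVM is going to have.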
> The 10B test passed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)