Posted to issues@trafodion.apache.org by "Oliver Bucaojit (JIRA)" <ji...@apache.org> on 2015/10/09 04:11:26 UTC

[jira] [Commented] (TRAFODION-1207) LP Bug: 1449190 - hbase split starvation due to transactions.

    [ https://issues.apache.org/jira/browse/TRAFODION-1207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949769#comment-14949769 ] 

Oliver Bucaojit commented on TRAFODION-1207:
--------------------------------------------

This issue is tracked in another JIRA, and the fix is currently being delivered. Link to that JIRA:
https://issues.apache.org/jira/browse/TRAFODION-34
The fix is to flush transactional state before a region split and read it back on region open.
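
For readers who want a picture of that idea, here is a minimal sketch, not the actual TRAFODION-34 change. Every name in it (SplitStateSketch, Region, durableStore, preSplit, open) is hypothetical, and an in-memory map stands in for whatever durable storage the real fix uses. It also simplifies by handing each daughter region all of the parent's saved state instead of only the part for its key range.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; this is not Trafodion's TrxRegionObserver code. */
public class SplitStateSketch {
    /** Stand-in for durable storage that outlives the parent region (assumption). */
    static final Map<String, Map<Long, String>> durableStore = new HashMap<>();

    static class Region {
        final String name;
        /** In-memory transactional state, keyed by transaction id. */
        final Map<Long, String> pendingTxState = new HashMap<>();

        Region(String name) {
            this.name = name;
        }

        /** Before the split: persist the in-flight state so the split need not wait. */
        void preSplit() {
            durableStore.put(name, new HashMap<>(pendingTxState));
            pendingTxState.clear();
        }

        /** On region open: read back whatever the parent flushed. */
        static Region open(String name, String parentName) {
            Region region = new Region(name);
            Map<Long, String> saved = durableStore.get(parentName);
            if (saved != null) {
                region.pendingTxState.putAll(saved);
            }
            return region;
        }
    }

    public static void main(String[] args) {
        Region parent = new Region("parent");
        parent.pendingTxState.put(101L, "write set for transaction 101");

        parent.preSplit(); // the split can proceed even though 101 is still in flight

        Region daughterA = Region.open("daughterA", "parent");
        Region daughterB = Region.open("daughterB", "parent");
        System.out.println("daughterA sees txn 101: " + daughterA.pendingTxState.containsKey(101L));
        System.out.println("daughterB sees txn 101: " + daughterB.pendingTxState.containsKey(101L));
    }
}
```

The point of the sketch is that the split no longer has to wait for a moment with zero active transactions, which is what was starving it in the report below.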

> LP Bug: 1449190 - hbase split starvation due to transactions.
> -------------------------------------------------------------
>
>                 Key: TRAFODION-1207
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-1207
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: dtm
>            Reporter: Guy Groulx
>            Assignee: Oliver Bucaojit
>            Priority: Critical
>             Fix For: 2.0-incubating
>
>
> We ran a longevity test on a system, running OE with 512 drivers.
> Our max hfile size was set to 10GB.
> After a while, we noticed the following in some of the hbase regionserver logs:
> 2015-04-27 10:35:06,990 INFO  [regionserver60020-splits-1430121725808] transactional.TrxRegionObserver: Delaying split due to transactions present. Delayed : 153 minute(s) on TRAFODION.JAVABENCH.OE_ORDERLINE_512,\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00,1430066791987.a1e39f281243d24c45d615c1b950f2a8.
> 2015-04-27 10:35:13,926 INFO  [regionserver60020-splits-1430123472882] transactional.TrxRegionObserver: Delaying split due to transactions present. Delayed : 124 minute(s) on TRAFODION.JAVABENCH.OE_ORDERLINE_512,\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00,1430066791987.a1e39f281243d24c45d615c1b950f2a8.
> Looking at the hdfs GUI:
> Contents of directory /apps/hbase/data/data/default/TRAFODION.JAVABENCH.OE_ORDERLINE_512/a1e39f281243d24c45d615c1b950f2a8/
> 04ae7ce619d24b0094d85d5c39ebf8a6	file	72.49 MB
> 559232b70b5340ddaa289a30dc4d7d2c	file	14.66 GB    <== This is 14.66GB, well over the 10GB max.
> 6c253a61ee344b1bb39d2f3a669103d3	file	72.56 MB
> 8837fc13d3a241d493b3ddcbd160d869	file	72.49 MB
> 901c5708daa048599de8d1441ed5ea89	file	72.48 MB
> b9d4bb1179414f9686f1f3271a2b434b	file	72.56 MB
> bbe2057994194a2693897bb5323a89fd	file	72.56 MB
> Notice how the 2nd entry is over 10GB. It can't split because we have active transactions, and because our 512 drivers are not letting up, the split is being starved.
> Once we killed the drivers, stopping new transactions, the split happened almost instantly.
> Hall, Gary (1:48 PM): winding down...
> 2015-04-27 17:48:57,235 INFO  [regionserver60020-splits-1430135811050] regionserver.SplitRequest: Region split, hbase:meta updated, and report to master. Parent=TRAFODION.JAVABENCH.OE_ORDERLINE_512,\x00\x00\x00\x0F\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00,1430066791987.2fb76848eeb9b1516ae7a80500e8870c., new regions: TRAFODION.JAVABENCH.OE_ORDERLINE_512,\x00\x00\x00\x0F\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00,1430135811140.4cd059f95ecba06425a5c39592de4157., TRAFODION.JAVABENCH.OE_ORDERLINE_512,\x00\x00\x00\x0F\x80\x00\x00\xE8\x80\x00\x00\x05\x80\x00\x000\x80\x00\x00\x07,1430135811140.62d1d89a34607dfde0ec66d18ad6e91f.. Split took 5hrs, 52mins, 6sec
> The log above says 5hr 52 mins, but the split actually took less than a minute once the transactions stopped.
> We understand that a split must be delayed until transactions have stopped, but in a high-transaction environment we need to make sure that a window is given for splits to actually happen.
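
The starvation described in the report, and the kind of window it asks for, can be pictured with a small sketch. This is an illustration under assumptions, not Trafodion code and not the approach taken in TRAFODION-34: the class SplitWindowSketch and its methods are hypothetical. A split may run only when no transactions are active; once a split is requested, new transactions are refused until the split has run, so the active count can drain to zero.

```java
/** Hypothetical illustration of a split window; not Trafodion or HBase code. */
public class SplitWindowSketch {
    private int activeTransactions = 0;
    private boolean splitWaiting = false;

    /** Returns false while a split is waiting, so the caller has to retry later. */
    public synchronized boolean tryBeginTransaction() {
        if (splitWaiting) {
            return false;
        }
        activeTransactions++;
        return true;
    }

    public synchronized void endTransaction() {
        activeTransactions--;
    }

    /** From this point on, new transactions are held back. */
    public synchronized void requestSplit() {
        splitWaiting = true;
    }

    /** The split runs only once every already-admitted transaction has ended. */
    public synchronized boolean trySplit() {
        if (!splitWaiting || activeTransactions > 0) {
            return false;
        }
        splitWaiting = false;
        return true;
    }

    public static void main(String[] args) {
        SplitWindowSketch gate = new SplitWindowSketch();
        gate.tryBeginTransaction();            // one driver is mid-transaction
        gate.requestSplit();                   // the region has grown past the max size

        System.out.println("new transaction admitted: " + gate.tryBeginTransaction());
        System.out.println("split while txn active:   " + gate.trySplit());

        gate.endTransaction();                 // the in-flight transaction finishes
        System.out.println("split after drain:        " + gate.trySplit());
        System.out.println("transactions resume:      " + gate.tryBeginTransaction());
    }
}
```

Without the splitWaiting check, 512 drivers that always overlap keep the active count above zero indefinitely, which matches the "Delaying split due to transactions present" messages above. The cost of such a gate is a short pause for new transactions while the split runs.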


