You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by "chunhui shen (Created) (JIRA)" <ji...@apache.org> on 2012/03/13 05:03:59 UTC

[jira] [Created] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Multi concurrent flushcache() for one region could cause data loss
------------------------------------------------------------------

                 Key: HBASE-5568
                 URL: https://issues.apache.org/jira/browse/HBASE-5568
             Project: HBase
          Issue Type: Bug
          Components: regionserver
            Reporter: chunhui shen
            Assignee: chunhui shen


We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.

Logs in RS for region e9d827913a056e696c39bc569ea3

2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
f99f., current region memstore size 128.0m
2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
f99f., current region memstore size 134.8m
2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
f99f., current region memstore size 6.8m
2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13231318#comment-13231318 ] 

Zhihong Yu commented on HBASE-5568:
-----------------------------------

Patch v2 looks good.
Will integrate if there is no objection.
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13231467#comment-13231467 ] 

Zhihong Yu commented on HBASE-5568:
-----------------------------------

Integrated to 0.94 as well.
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228215#comment-13228215 ] 

Lars Hofhansl commented on HBASE-5568:
--------------------------------------

Thanks Chunhui. In that case, wouldn't a better solution be to pull <result> out of the try block and then only set flushing to false if internalFlushCache returned true?

                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232444#comment-13232444 ] 

ramkrishna.s.vasudevan commented on HBASE-5568:
-----------------------------------------------

Sorry Ted.  I did not notice that. :(
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568-92v2.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "chunhui shen (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-5568:
--------------------------------

    Attachment: HBASE-5568-90.patch

Submit a patch for 90 version
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Zhihong Yu (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5568:
------------------------------

    Attachment: HBASE-5568.patch

Re-attaching Chunhui's patch
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13231464#comment-13231464 ] 

Hudson commented on HBASE-5568:
-------------------------------

Integrated in HBase-TRUNK #2684 (See [https://builds.apache.org/job/HBase-TRUNK/2684/])
    HBASE-5568 Multi concurrent flushcache() for one region could cause data loss (Chunhui) (Revision 1301639)

     Result = SUCCESS
tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java

                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228229#comment-13228229 ] 

chunhui shen commented on HBASE-5568:
-------------------------------------

bq. only set flushing to false if internalFlushCache returned true?
internalFlushCache() may throws IOException, If we only flushing to false in the case internalFlushCache() returned true, I think no flushcache could happen if throws IOException.
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229138#comment-13229138 ] 

Zhihong Yu commented on HBASE-5568:
-----------------------------------

@Chunhui:
Can you reattach patch for TRUNK and submit it to Hadoop QA ?
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13231361#comment-13231361 ] 

Zhihong Yu commented on HBASE-5568:
-----------------------------------

Integrated to TRUNK.

Thanks for the patch Chunhui.
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229019#comment-13229019 ] 

Lars Hofhansl commented on HBASE-5568:
--------------------------------------

@chunhui: looked at the patch again. Makes sense now. +1
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "stack (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13231507#comment-13231507 ] 

stack commented on HBASE-5568:
------------------------------

+1 Good find Chunhui.
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Reopened] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Zhihong Yu (Reopened) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu reopened HBASE-5568:
-------------------------------


Patch for 0.92 hasn't been integrated.
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568-92v2.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229495#comment-13229495 ] 

Zhihong Yu commented on HBASE-5568:
-----------------------------------

I can reproduce TestStore failure on my laptop.
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228192#comment-13228192 ] 

Lars Hofhansl commented on HBASE-5568:
--------------------------------------

Hi Chunhui. You patch only moves the synchronized block out of the try block. Could you explain why this fixes the issue?
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228995#comment-13228995 ] 

ramkrishna.s.vasudevan commented on HBASE-5568:
-----------------------------------------------

@Chunhui

Is this some how related with HBASE-5312?
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Zhihong Yu (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5568:
------------------------------

    Comment: was deleted

(was: @Chunhui:
Can you reattach patch for TRUNK and submit it to Hadoop QA ?)
    
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13231324#comment-13231324 ] 

Zhihong Yu commented on HBASE-5568:
-----------------------------------

@Chunhui:
A patch for 0.92 is needed:
{code}
p0 HBASE-5568v2.patch
...
Hunk #1 FAILED at 152.
1 out of 1 hunk FAILED -- saving rejects to file src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java.rej
{code}
Please also update patch for 0.90 as well.
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229425#comment-13229425 ] 

Hadoop QA commented on HBASE-5568:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12518339/HBASE-5568.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 161 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

     -1 core tests.  The patch failed these unit tests:
                       org.apache.hadoop.hbase.regionserver.TestStore
                  org.apache.hadoop.hbase.replication.TestReplicationPeer
                  org.apache.hadoop.hbase.mapreduce.TestImportTsv
                  org.apache.hadoop.hbase.mapred.TestTableMapReduce
                  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1187//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1187//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1187//console

This message is automatically generated.
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229865#comment-13229865 ] 

Zhihong Yu commented on HBASE-5568:
-----------------------------------

Yes.
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Issue Comment Edited] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Zhihong Yu (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13231361#comment-13231361 ] 

Zhihong Yu edited comment on HBASE-5568 at 3/16/12 4:41 PM:
------------------------------------------------------------

Integrated to TRUNK.

Thanks for the patch Chunhui.

Thanks for the review Ramkrishna and Lars.
                
      was (Author: zhihyu@ebaysf.com):
    Integrated to TRUNK.

Thanks for the patch Chunhui.
                  
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13231540#comment-13231540 ] 

Hudson commented on HBASE-5568:
-------------------------------

Integrated in HBase-0.94 #34 (See [https://builds.apache.org/job/HBase-0.94/34/])
    HBASE-5568 Multi concurrent flushcache() for one region could cause data loss (Chunhui) (Revision 1301676)

     Result = SUCCESS
tedyu : 
Files : 
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java

                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Zhihong Yu (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5568:
------------------------------

    Attachment:     (was: HBASE-5568v2.patch)
    
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "chunhui shen (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-5568:
--------------------------------

    Attachment: HBASE-5568v2.patch

patch v2 for test case:
change TestStore#testDeleteExpiredStoreFiles#ttl from 1 to 4 
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "ramkrishna.s.vasudevan (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-5568:
------------------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Resolving as committed to all versions.
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568-92v2.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229469#comment-13229469 ] 

ramkrishna.s.vasudevan commented on HBASE-5568:
-----------------------------------------------

{code}
hdfs_out=org.apache.hadoop.fs.FSDataOutputStream@61736e
java.lang.NoSuchMethodException: org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.getNumCurrentReplicas()
	at java.lang.Class.getDeclaredMethod(Class.java:1937)
	at org.apache.hadoop.hbase.regionserver.wal.HLog.getGetNumCurrentReplicas(HLog.java:460)
	at org.apache.hadoop.hbase.regionserver.wal.HLog.<init>(HLog.java:443)
	at org.apache.hadoop.hbase.regionserver.wal.HLog.<init>(HLog.java:341)
	at org.apache.hadoop.hbase.regionserver.TestStore.init(TestStore.java:147)
	at org.apache.hadoop.hbase.regionserver.TestStore.testDeleteExpiredStoreFiles(TestStore.java:162)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
{code}
In TestStore failures i saw above exception traces.  Its not related to this fix but just wanted to ensure is this expected? or are we missing something?
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229861#comment-13229861 ] 

Lars Hofhansl commented on HBASE-5568:
--------------------------------------

@Ted: You see the failure only with this patch applied?
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232446#comment-13232446 ] 

Zhihong Yu commented on HBASE-5568:
-----------------------------------

Integrated 0.92 patch to 0.92 branch.
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568-92v2.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Lars Hofhansl (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-5568:
---------------------------------

    Fix Version/s: 0.96.0
                   0.94.0
                   0.92.2

I think we want this in at least 0.92, 0.94, and 0.96.
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13231517#comment-13231517 ] 

Zhihong Yu commented on HBASE-5568:
-----------------------------------

Integrated to 0.90 branch as well.
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13231804#comment-13231804 ] 

Hadoop QA commented on HBASE-5568:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12518769/HBASE-5568-92v2.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1215//console

This message is automatically generated.
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568-92v2.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Ted Yu (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu resolved HBASE-5568.
---------------------------

      Resolution: Fixed
    Hadoop Flags: Reviewed

Integrated to 0.92
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568-92v2.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229894#comment-13229894 ] 

chunhui shen commented on HBASE-5568:
-------------------------------------

I think it's the problem of test case TestStore#testDeleteExpiredStoreFiles.
See its code:
{code}
int storeFileNum = 4;
int ttl = 1;
...    
hcd.setTimeToLive(ttl);
init(getName(), conf, hcd);

    long sleepTime = this.store.scanInfo.getTtl() / storeFileNum;
    ...
    for (int i = 1; i <= storeFileNum; i++) {
      LOG.info("Adding some data for the store file #" + i);
      timeStamp = EnvironmentEdgeManager.currentTimeMillis();
      this.store.add(new KeyValue(row, family, qf1, timeStamp, (byte[]) null));
      this.store.add(new KeyValue(row, family, qf2, timeStamp, (byte[]) null));
      this.store.add(new KeyValue(row, family, qf3, timeStamp, (byte[]) null));
      flush(i);
      Thread.sleep(sleepTime);
    }
{code}

Because ttl=1, so sleepTime=250ms, so for the 4 store files , the maxExpiredTimeStamp discrepancy is only 250ms.

so they may be expired at the same time if you run slowly and then CompactionRequest cr = this.store.requestCompaction(); cr.getFiles().size() return 3.
We can ensure this from the logs:
https://builds.apache.org/job/PreCommit-HBASE-Build/1187//testReport/org.apache.hadoop.hbase.regionserver/TestStore/testDeleteExpiredStoreFiles/

{code}2012-03-14 17:19:14,672 INFO  [pool-1-thread-1] regionserver.Store(1002): Completed compaction of 1 file(s) in family of table,,1331745551742.011a93cc4f763307dc36f577662db4b1. into none, size=none; total size for store is 2.4k
2012-03-14 17:19:14,923 INFO  [pool-1-thread-1] compactions.CompactSelection(104): Deleting the expired store file by compaction: /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/target/test-data/4d75667a-f352-40cc-978a-f36fec71baf2/TestStoretestDeleteExpiredStoreFiles/011a93cc4f763307dc36f577662db4b1/family/489673e6568241c6bb500b34d7ff29ad whose maxTimeStamp is 1331745552397 while the max expired timestamp is 1331745553923
2012-03-14 17:19:14,923 INFO  [pool-1-thread-1] compactions.CompactSelection(104): Deleting the expired store file by compaction: /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/target/test-data/4d75667a-f352-40cc-978a-f36fec71baf2/TestStoretestDeleteExpiredStoreFiles/011a93cc4f763307dc36f577662db4b1/family/7ac26f52fd214aca88bd555f3c82dd91 whose maxTimeStamp is 1331745552671 while the max expired timestamp is 1331745553923
2012-03-14 17:19:14,923 INFO  [pool-1-thread-1] compactions.CompactSelection(104): Deleting the expired store file by compaction: /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/trunk/target/test-data/4d75667a-f352-40cc-978a-f36fec71baf2/TestStoretestDeleteExpiredStoreFiles/011a93cc4f763307dc36f577662db4b1/family/7ebf7908f5044fcab65a7327d3e19405 whose maxTimeStamp is 1331745552939 while the max expired timestamp is 1331745553923
{code}

{code}
for (int i = 1; i <= storeFileNum; i++) {
      // verify the expired store file.
      CompactionRequest cr = this.store.requestCompaction();
      assertEquals(1, cr.getFiles().size());
      assertTrue(cr.getFiles().get(0).getReader().getMaxTimestamp() < 
          (System.currentTimeMillis() - this.store.scanInfo.getTtl()));
      // Verify that the expired the store has been deleted.
      this.store.compact(cr);
      assertEquals(storeFileNum - i, this.store.getStorefiles().size());

      // Let the next store file expired.
      Thread.sleep(sleepTime);
    }{code}

after the first compaction,
the next call this.store.requestCompaction(), the other three store files are  all expired at one time.
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "chunhui shen (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-5568:
--------------------------------

    Attachment: HBASE-5568.patch
    
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "ramkrishna.s.vasudevan (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-5568:
------------------------------------------

    Fix Version/s: 0.90.7

Updated 0.90.7 also in fix versions.
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230910#comment-13230910 ] 

Hadoop QA commented on HBASE-5568:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12518615/HBASE-5568v2.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 161 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

     -1 core tests.  The patch failed these unit tests:
                       org.apache.hadoop.hbase.client.TestAdmin
                  org.apache.hadoop.hbase.mapreduce.TestImportTsv
                  org.apache.hadoop.hbase.mapred.TestTableMapReduce
                  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1197//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1197//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1197//console

This message is automatically generated.
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228198#comment-13228198 ] 

chunhui shen commented on HBASE-5568:
-------------------------------------

@Lars
In current flushcache() logic, we consider three continuous call flushcache().
1.The first, call internalFlushcache(status) successfully.
2.The second, because writestate.flushing=true, so it will return false, but it will set writestate.flushing = false finally.
3.The third, will call internalFlushcache(status) successfully again.
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13232488#comment-13232488 ] 

Hudson commented on HBASE-5568:
-------------------------------

Integrated in HBase-0.92 #329 (See [https://builds.apache.org/job/HBase-0.92/329/])
    HBASE-5568 Multi concurrent flushcache() for one region could cause data loss (Chunhui) (Revision 1302270)

     Result = FAILURE
tedyu : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java

                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568-92v2.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13231859#comment-13231859 ] 

Hudson commented on HBASE-5568:
-------------------------------

Integrated in HBase-TRUNK-security #140 (See [https://builds.apache.org/job/HBase-TRUNK-security/140/])
    HBASE-5568 Multi concurrent flushcache() for one region could cause data loss (Chunhui) (Revision 1301639)

     Result = SUCCESS
tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java

                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568-92v2.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Zhihong Yu (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihong Yu updated HBASE-5568:
------------------------------

    Status: Patch Available  (was: Open)
    
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229013#comment-13229013 ] 

ramkrishna.s.vasudevan commented on HBASE-5568:
-----------------------------------------------

As per the comments on HBASE-5312 i feel the patch makes sense.
I am +1 on it. It is needed in 0.90 also?


                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>         Attachments: HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "chunhui shen (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-5568:
--------------------------------

    Attachment: HBASE-5568v2.patch
    
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Zhihong Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229478#comment-13229478 ] 

Zhihong Yu commented on HBASE-5568:
-----------------------------------

In getGetNumCurrentReplicas():
{code}
      LOG.info("getNumCurrentReplicas--HDFS-826 not available; hdfs_out=" +
        os, exception);
{code}
I think we should omit exception if exception is instanceof NoSuchMethodException because the log is at info level.
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Hadoop QA (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230940#comment-13230940 ] 

Hadoop QA commented on HBASE-5568:
----------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12518621/HBASE-5568v2.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 161 new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

     -1 core tests.  The patch failed these unit tests:
                       org.apache.hadoop.hbase.mapreduce.TestImportTsv
                  org.apache.hadoop.hbase.mapred.TestTableMapReduce
                  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1198//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1198//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1198//console

This message is automatically generated.
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "Lars Hofhansl (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13231499#comment-13231499 ] 

Lars Hofhansl commented on HBASE-5568:
--------------------------------------

Thanks Ted.
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "chunhui shen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229843#comment-13229843 ] 

chunhui shen commented on HBASE-5568:
-------------------------------------

I have run org.apache.hadoop.hbase.regionserver.TestStore many times on my local pc, all passed.
Is there any other problem?
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568.patch, HBASE-5568.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-5568) Multi concurrent flushcache() for one region could cause data loss

Posted by "chunhui shen (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-5568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-5568:
--------------------------------

    Attachment: HBASE-5568-92v2.patch

patch for 92 version
                
> Multi concurrent flushcache() for one region could cause data loss
> ------------------------------------------------------------------
>
>                 Key: HBASE-5568
>                 URL: https://issues.apache.org/jira/browse/HBASE-5568
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>
>         Attachments: HBASE-5568-90.patch, HBASE-5568-92v2.patch, HBASE-5568.patch, HBASE-5568.patch, HBASE-5568v2.patch
>
>
> We could call HRegion#flushcache() concurrently now through HRegionServer#splitRegion or HRegionServer#flushRegion by HBaseAdmin.
> However, we find if HRegion#internalFlushcache() is called concurrently by multi thread, HRegion.memstoreSize will be calculated wrong.
> At the end of HRegion#internalFlushcache(), we will do this.addAndGetGlobalMemstoreSize(-flushsize), but the flushsize may not the actual memsize which flushed to hdfs. It cause HRegion.memstoreSize is negative and prevent next flush if we close this region.
> Logs in RS for region e9d827913a056e696c39bc569ea3
> 2012-03-11 16:31:36,690 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 128.0m
> 2012-03-11 16:31:37,999 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/8162481165586107427, entries=153106, sequenceid=619316544, memsize=59.6m, filesize=31.2m
> 2012-03-11 16:31:38,830 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 134.8m
> 2012-03-11 16:31:39,458 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/3425971951499794221, entries=230183, sequenceid=619316544, memsize=68.5m, filesize=26.6m
> 2012-03-11 16:31:39,459 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~128.1m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 2769ms, sequenceid=619316544, compaction requested=false
> 2012-03-11 16:31:39,459 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for writetest1,,1331454657410.e9d827913a056e696c39bc569ea3
> f99f., current region memstore size 6.8m
> 2012-03-11 16:31:39,529 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/1811012969998104626, entries=8002, sequenceid=619332759, memsize=3.1m, filesize=1.6m
> 2012-03-11 16:31:39,640 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/770333473623552048, entries=12231, sequenceid=619332759, memsize=3.6m, filesize=1.4m
> 2012-03-11 16:31:39,641 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~134.8m for region writetest1,,1331454657410.e9d827913a
> 056e696c39bc569ea3f99f. in 811ms, sequenceid=619332759, compaction requested=true
> 2012-03-11 16:31:39,707 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf1/5656568849587368557, entries=119, sequenceid=619332979, memsize=47.4k, filesize=25.6k
> 2012-03-11 16:31:39,775 INFO org.apache.hadoop.hbase.regionserver.Store: Added hdfs://dw74.kgb.sqa.cm4:9700/hbase-func1/writetest1/e9d827913a056e696c39bc569e
> a3f99f/cf2/794343845650987521, entries=157, sequenceid=619332979, memsize=47.8k, filesize=19.3k
> 2012-03-11 16:31:39,777 INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~6.8m for region writetest1,,1331454657410.e9d827913a05
> 6e696c39bc569ea3f99f. in 318ms, sequenceid=619332979, compaction requested=true

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira