Posted to common-dev@hadoop.apache.org by "stack (JIRA)" <ji...@apache.org> on 2007/05/31 02:03:15 UTC

[jira] Created: (HADOOP-1445) Support updates across region splits and compactions

Support updates across region splits and compactions
----------------------------------------------------

                 Key: HADOOP-1445
                 URL: https://issues.apache.org/jira/browse/HADOOP-1445
             Project: Hadoop
          Issue Type: Bug
          Components: contrib/hbase
            Reporter: stack
            Assignee: stack


Running an extended serial write test (EvaluationClient), one that runs long enough to trigger compactions and splits, hbase falls over.  During a compaction, the client gets locked out.  During a split, the client times out.

During compactions, hbase should be able to continue to take writes/updates.  During splits, the client will have to recalibrate so that writes go into the new splits instead, but again, other than a pause, writes shouldn't be dropped.
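
As a rough illustration of what "recalibrating" after a split could look like on the client side, here is a minimal sketch. It assumes the client keeps a sorted map from region start key to hosting server; the class RegionLocatorSketch, its methods, and the server names are hypothetical and are not taken from the HBase code. A real client would learn about the new regions by re-reading the META table rather than being told directly.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical client-side cache of region locations, keyed by start key.
class RegionLocatorSketch {
    private final TreeMap<String, String> startKeyToServer =
        new TreeMap<String, String>();

    void putRegion(String startKey, String server) {
        startKeyToServer.put(startKey, server);
    }

    // The region holding a row is the one with the greatest start key
    // that is less than or equal to the row.
    String locate(String row) {
        Map.Entry<String, String> entry = startKeyToServer.floorEntry(row);
        if (entry == null) {
            throw new IllegalStateException("No region covers row " + row);
        }
        return entry.getValue();
    }

    // After a split, the parent region is replaced by two daughters: the
    // lower one keeps the parent's start key, the upper one starts at the
    // split key.
    void recordSplit(String parentStartKey, String splitKey,
                     String lowerServer, String upperServer) {
        startKeyToServer.put(parentStartKey, lowerServer);
        startKeyToServer.put(splitKey, upperServer);
    }
}
```

Once the split is recorded, a write for a row above the split key resolves to the upper daughter without any other change to the caller.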

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Work started: (HADOOP-1445) Support updates across region splits and compactions

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HADOOP-1445 started by stack.

[jira] Updated: (HADOOP-1445) Support updates across region splits and compactions

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-1445:
--------------------------

    Status: Patch Available  (was: In Progress)

[jira] Updated: (HADOOP-1445) Support updates across region splits and compactions

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman updated HADOOP-1445:
----------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this. Thanks Michael.

[jira] Updated: (HADOOP-1445) Support updates across region splits and compactions

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-1445:
--------------------------

    Status: Patch Available  (was: In Progress)

Pass patch via hudson build

[jira] Updated: (HADOOP-1445) Support updates across region splits and compactions

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-1445:
--------------------------

    Attachment: hadoop1445.patch

Fix for updates failing across compactions and region splits.

Commit message:

HADOOP-1445 "Support updates across region splits and compactions".

High-level:

* numRetries wasn't set in HBaseRegion so we'd never update table after a split.
* Scanners were failing to go past the first deleted value.
* Large compactions were failing because remote held an unnecessary write lock
while compaction was going on (long compactions caused client timeout).
* Region shutdown is now two stage: outstanding transactions can finish up
but new requests to open a transaction are rejected
(NotServingRegionException -- perhaps should be a message in place of
exception).
* Client retries are now just one variable rather than split over two.
* clientTimeout was happening even on success finding server in
scanOneMetaRegion, so timeouts happened even on successful updates.
* Was looking for the wrong exception to trigger retries: was
NotServingRegionException when it should have been a RemoteException bearing
an NSRE.
* Moved to use org.apache.hadoop.io.retry for remote retries.
* Removed a bunch of Eclipse 'synthetic accessor' warnings by making items
default access rather than private.
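
The retry-related items in the list above can be sketched roughly as follows. This is not the patch's code and does not use org.apache.hadoop.io.retry; RemoteExceptionSketch is a hypothetical stand-in for Hadoop's RemoteException, and RetrySketch, callWithRetries, and pauseMillis are names made up for illustration. The point it shows is that the client must inspect the class name carried by the remote exception, since the NSRE never arrives as itself.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical stand-in for Hadoop's RemoteException: an IOException that
// carries the class name of the exception thrown on the server side.
class RemoteExceptionSketch extends IOException {
    private final String className;

    RemoteExceptionSketch(String className, String message) {
        super(message);
        this.className = className;
    }

    String getClassName() {
        return className;
    }
}

class RetrySketch {
    static final String NSRE =
        "org.apache.hadoop.hbase.NotServingRegionException";

    // Runs the call up to numRetries times. Only a remote exception that
    // bears an NSRE triggers a pause and a retry; anything else propagates.
    static <T> T callWithRetries(Callable<T> call, int numRetries,
                                 long pauseMillis) throws Exception {
        RemoteExceptionSketch last = null;
        for (int attempt = 0; attempt < numRetries; attempt++) {
            try {
                return call.call();
            } catch (RemoteExceptionSketch e) {
                if (!NSRE.equals(e.getClassName())) {
                    throw e;
                }
                last = e;
                // A real client would also refresh its region location here.
                Thread.sleep(pauseMillis);
            }
        }
        throw new IOException("Gave up after " + numRetries + " attempts", last);
    }
}
```

A single retry count and a single pause are the only knobs, which matches the move to one retry variable described above.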

Upshot is that now EvaluationClient sequentialWrite runs to completion (~20
minutes of continuous writing across multiple compactions and region splits).
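
The compaction fix that makes this possible can be illustrated with a minimal sketch. It assumes a WriteState-like structure holding two flags; the class CompactionGuardSketch, its fields, and its methods are hypothetical and much simpler than the real HRegion. Skipping a flush during compaction, rather than making it wait, is a simplification chosen to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

class CompactionGuardSketch {
    // Simplified stand-in for the WriteState structure; the two flags are
    // assumptions made for illustration.
    static final class WriteState {
        boolean compacting;
        boolean flushing;
    }

    private final WriteState writeState = new WriteState();
    private final List<String> memcache = new ArrayList<String>();
    private int flushedCount;

    // Updates only touch the in-memory cache, so a running compaction
    // never blocks them.
    void update(String edit) {
        synchronized (memcache) {
            memcache.add(edit);
        }
    }

    // Raises the guard that keeps flushes out. Returns false if a flush or
    // another compaction is already running.
    boolean startCompaction() {
        synchronized (writeState) {
            if (writeState.flushing || writeState.compacting) {
                return false;
            }
            writeState.compacting = true;
            return true;
        }
    }

    void finishCompaction() {
        synchronized (writeState) {
            writeState.compacting = false;
        }
    }

    // Flushes the cache unless a compaction or another flush is running.
    // Returns true if the flush happened.
    boolean flush() {
        synchronized (writeState) {
            if (writeState.compacting || writeState.flushing) {
                return false;
            }
            writeState.flushing = true;
        }
        try {
            synchronized (memcache) {
                flushedCount += memcache.size();
                memcache.clear();
            }
            return true;
        } finally {
            synchronized (writeState) {
                writeState.flushing = false;
            }
        }
    }

    int pendingEdits() {
        synchronized (memcache) {
            return memcache.size();
        }
    }

    int flushedEdits() {
        synchronized (memcache) {
            return flushedCount;
        }
    }
}
```

The guard is narrower than a region-wide write lock: it holds back flushes only, so a long compaction no longer turns into a client timeout.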

Detail:
* src/contrib/hbase/conf/hbase-default.xml
    (hbase.client.timeout.length): Default should be 30 seconds, not 10.
    (hbase.client.timeout.number): Removed. Instead have just one retry
    setting, hbase.client.retries.number, rather than two.
    (hbase.client.retries.number): Upped default from 2 to 4.
    (hbase.regionserver.lease.period): Upped from 30 seconds to 180.
    (hbase.regionserver.maxlogentries, hbase.hregion.maxunflushed,
    hbase.hregion.max.filesize): Added.
* src/contrib/hbase/src/test/org/apache/hadoop/hbase/MiniHBaseCluster.java
    Removed try/catch around regionservers.stop (no longer necessary).
* src/contrib/hbase/src/test/org/apache/hadoop/hbase/EvaluationClient.java
    Fix accidental commit where ONE_HUNDRED_MB was actually 1MB.
    Generally change name of 'range' variable to 'i'.
    Add more help to usage.
* src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestScanner2.java
    Added. Tests scans where a row has been deleted.
* src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestCompare.java
    Added.  Test compare of HBase types.
* src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestScanner.java
    Javadoc.
* src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestHRegion.java
    Name of region row lock method changed from obtainLock to
    obtainRowLock.  Closing of a region became two-stepped.
    Implement listeners.
* src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStore.java
    Javadoc. Replace verbose 'for' setup w/ terse 1.5 foreach.
    Removed odd try/catch surround of empty space.
* src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMerge.java
    Read max file size from configuration.
* src/contrib/hbase/src/java/org/apache/hadoop/hbase/Leases.java
    Line lengths.  Make < 80.
* src/contrib/hbase/src/java/org/apache/hadoop/hbase/RegionNotFoundException.java
* src/contrib/hbase/src/java/org/apache/hadoop/hbase/WrongRegionException.java
    Added.
* src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionServer.java
    Make imports specific.
    Change a bunch of data members from private to default access
    (Eclipse promises performance improvements when it doesn't have to
    create 'synthetic accessors' for private members or methods).
    Renamed regions as onlineRegions.  Added notion of retiringRegions
    to support completion of outstanding transactions (new ones are
    rejected).
    (UpdateMetaInteface): Added.  Needed by org.apache.hadoop.io.retry
    used to cover splits (in case a retry is necessary).
    (SplitOrCompactChecker): Refactored.  Broke most of the run method
    out into a new split method.
    Refactored how split works.  Moved actual addRegion/deleteRegions
    out to HRegion as utility methods (Jim just before this commit
    did similar -- this builds on his work and moves even more utility
    to HRegion.  Useful building test cases).
    Changed logging at end of all threads so that if a thread exits, the
    log level is INFO rather than DEBUG, so we notice thread exit (did
    this for flusher, splitter, logroller threads).
    (stop): Remove unthrown IOE.
    (get): Refactored.  Removed test for result != null so we could
    return either result or null.
    (next): If all columns in a row are deleted values, get another
    row (previously we would return a row with no values and the scanner
    would shut down, so we never got past a row with deleted values).
    (getRegion): Added an override with flag as to whether to look
    in retiring list of regions.
    Added logging of expired scanner leases.
* src/contrib/hbase/src/java/org/apache/hadoop/hbase/HConstants.java
    Renamed DESIRED_MAX_FILE_SIZE as DEFAULT_MAX_FILE_SIZE.
* src/contrib/hbase/src/java/org/apache/hadoop/hbase/HClient.java
    Renamed TableInfo data structure as RegionLocation.  More descriptive.
    Added RegionLocation#toString.
    Removed numTimeouts. Use numRetries everywhere for retries instead.
    Renamed clientTimeout as pause.
    Log enabling, disabling, and deletion of tables.
    (startUpdate): Refactored to use org.apache.hadoop.io.retry.
    (printCreateTableUsage, printDeleteTableUsage): Added.
* src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMaster.java
    Moved HRegion utility methods -- getRegionInfo, getServerName,
    and getStartCode -- to HRegion.
    Added catch handler for new UnknownScannerException to META scanners.
    Use new HRegion.createRegion utility.
* src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionInfo.java
    (comparableTo): Replace w/ implementation that sorts regions with
    lower startKeys before those of higher startKeys.
* src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMemcache.java
    Line-lengths, iterators and javadoc.
* src/contrib/hbase/src/java/org/apache/hadoop/hbase/HAbstractScanner.java
    Changed access to default from private for members accessed out of
    inner classes.
* src/contrib/hbase/src/java/org/apache/hadoop/hbase/UnknownScannerException.java
    Added.
* src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegion.java
    During compaction, don't need a write lock for the duration -- just
    need to block flushing to disk.  Use the WriteState structure to put up
    this guard rather than the broader write lock (write lock was stopping
    updates coming in -- they'd timeout if compaction was taking too long).
    Make max size of region configurable.
    Replaced iterators with terser foreach version.
    (obtainLock, releaseLock): Renamed as obtainRowLock, releaseRowLock.
    (waitOnRowLocks): Added.
    (createHRegion, addRegionToMETA, removeRegionFromMETA,
    getServerName, getStartCode): Added.
* src/contrib/hbase/src/java/org/apache/hadoop/hbase/RegionUnavailableListener.java
    Changed move of region to unavailable to be a two-stepped process.
    (regionIsUnavailable): Removed.
    (closing, closed): Added.
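
The two-stage move to unavailable, together with the onlineRegions/retiringRegions split described under HRegionServer, can be sketched as below. The interface and class names, the method signatures, and the use of a RuntimeException are all assumptions for illustration (the real NotServingRegionException is presumably a checked exception and the real callbacks likely take richer arguments).

```java
import java.util.HashMap;
import java.util.Map;

// Modeled on the closing/closed callbacks named above; the signatures are
// assumptions.
interface RegionUnavailableListenerSketch {
    void closing(String regionName);
    void closed(String regionName);
}

// Hypothetical unchecked stand-in for NotServingRegionException.
class NotServingRegionSketch extends RuntimeException {
    NotServingRegionSketch(String regionName) {
        super("Not serving region " + regionName);
    }
}

class RegionServerSketch implements RegionUnavailableListenerSketch {
    private final Map<String, Object> onlineRegions =
        new HashMap<String, Object>();
    private final Map<String, Object> retiringRegions =
        new HashMap<String, Object>();

    synchronized void open(String regionName) {
        onlineRegions.put(regionName, new Object());
    }

    // Stage one: stop accepting new transactions, but keep the region
    // reachable for transactions that are already in flight.
    public synchronized void closing(String regionName) {
        Object region = onlineRegions.remove(regionName);
        if (region != null) {
            retiringRegions.put(regionName, region);
        }
    }

    // Stage two: outstanding transactions are done; forget the region.
    public synchronized void closed(String regionName) {
        retiringRegions.remove(regionName);
    }

    // New transactions pass checkRetiring = false; calls that finish an
    // existing transaction pass true so they can still find the region.
    synchronized Object getRegion(String regionName, boolean checkRetiring) {
        Object region = onlineRegions.get(regionName);
        if (region == null && checkRetiring) {
            region = retiringRegions.get(regionName);
        }
        if (region == null) {
            throw new NotServingRegionSketch(regionName);
        }
        return region;
    }
}
```

Between closing and closed, a commit of an open transaction still succeeds while a request to start a new one is rejected, which is what lets the client retry against the new regions after a split.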

[jira] Updated: (HADOOP-1445) Support updates across region splits and compactions

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-1445:
--------------------------

    Attachment: hadoop1445-v2.patch~

Version 2. Fixes javadoc warnings.

[jira] Commented: (HADOOP-1445) Support updates across region splits and compactions

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501101 ] 

Hadoop QA commented on HADOOP-1445:
-----------------------------------

-1, new javadoc warnings

The javadoc tool appears to have generated warning messages when testing the latest attachment http://issues.apache.org/jira/secure/attachment/12358798/hadoop1445.patch against trunk revision r543862.

Test results:   http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/241/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/241/console

Please note that this message is automatically generated and may represent a problem with the automation system and not the patch.

[jira] Commented: (HADOOP-1445) Support updates across region splits and compactions

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501265 ] 

Hadoop QA commented on HADOOP-1445:
-----------------------------------

+1

http://issues.apache.org/jira/secure/attachment/12358876/hadoop1445-v2.patch applied and successfully tested against trunk revision r543862.

Test results:   http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/242/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/242/console

[jira] Updated: (HADOOP-1445) Support updates across region splits and compactions

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-1445:
--------------------------

    Attachment: hadoop1445-v2.patch

[jira] Updated: (HADOOP-1445) Support updates across region splits and compactions

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-1445:
--------------------------

    Attachment: hadoop1445.patch

v2 w/o tilde

[jira] Updated: (HADOOP-1445) Support updates across region splits and compactions

Posted by "stack (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HADOOP-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HADOOP-1445:
--------------------------

    Status: In Progress  (was: Patch Available)
