Posted to dev@hbase.apache.org by "Jonathan Gray (JIRA)" <ji...@apache.org> on 2008/06/25 19:46:47 UTC

[jira] Created: (HBASE-707) High-load import of data into single table/family never triggers split

High-load import of data into single table/family never triggers split
----------------------------------------------------------------------

                 Key: HBASE-707
                 URL: https://issues.apache.org/jira/browse/HBASE-707
             Project: Hadoop HBase
          Issue Type: Bug
    Affects Versions: 0.1.3
         Environment: Linux 2.6.25-14.fc9.x86_64, Fedora Core 9
            Reporter: Jonathan Gray
             Fix For: 0.1.3


Importing a heavy amount of data into a single table and family.

One column in that family (the same fam:col for every row) frequently contains a large amount of UTF-8 data.  This column grows and grows but never causes a region split.

Currently there is a single mapfile containing nearly 10GB.

Eventually this has caused regions to crash with OOME, as described in HBASE-706

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Closed: (HBASE-707) High-load import of data into single table/family never triggers split

Posted by "Jim Kellerman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman closed HBASE-707.
-------------------------------


> High-load import of data into single table/family never triggers split
> ----------------------------------------------------------------------
>
>                 Key: HBASE-707
>                 URL: https://issues.apache.org/jira/browse/HBASE-707
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.1.3
>         Environment: Linux 2.6.25-14.fc9.x86_64, Fedora Core 9
>            Reporter: Jonathan Gray
>            Assignee: stack
>             Fix For: 0.1.3
>
>         Attachments: 707.patch
>
>
> Importing a heavy amount of data into a single table and family.
> One column in that family (the same fam:col for every row) contains a frequently large amount of UTF-8 data.  This column grows and grows but never causes a region split.
> Currently there is a single mapfile containing nearly 10GB.
> Eventually this has caused regions to crash with OOME, as described in HBASE-706
> Table in question:
> hql > describe items;
> +-----------------------------------------------------------------------------+
> | Column Family Descriptor                                                    |
> +-----------------------------------------------------------------------------+
> | name: cfrecs, max versions: 2, compression: NONE, in memory: false, max leng|
> | th: 2147483647, bloom filter: none                                          |
> +-----------------------------------------------------------------------------+
> | name: clusters, max versions: 2, compression: NONE, in memory: false, max le|
> | ngth: 2147483647, bloom filter: none                                        |
> +-----------------------------------------------------------------------------+
> | name: content, max versions: 2, compression: NONE, in memory: false, max len|
> | gth: 2147483647, bloom filter: none                                         |
> +-----------------------------------------------------------------------------+
> | name: readby, max versions: 2, compression: NONE, in memory: false, max leng|
> | th: 2147483647, bloom filter: none                                          |
> +-----------------------------------------------------------------------------+
> | name: receivedby, max versions: 2, compression: NONE, in memory: false, max |
> | length: 2147483647, bloom filter: none                                      |
> +-----------------------------------------------------------------------------+
> | name: savedby, max versions: 2, compression: NONE, in memory: false, max len|
> | gth: 2147483647, bloom filter: none                                         |
> +-----------------------------------------------------------------------------+
> | name: sentby, max versions: 2, compression: NONE, in memory: false, max leng|
> | th: 2147483647, bloom filter: none                                          |
> +-----------------------------------------------------------------------------+
> 7 columnfamily(s) in set. (0.34 sec)



[jira] Commented: (HBASE-707) High-load import of data into single table/family never triggers split

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608229#action_12608229 ] 

Jonathan Gray commented on HBASE-707:
-------------------------------------

Recreated entire use case scenario and the issue is gone.  We are now seeing normal region splits.

However, we have experienced a new behavior during those splits.  We are writing and the client receives an IllegalStateException:

Trying to commit: Exception in thread "main" java.lang.RuntimeException: java.lang.IllegalStateException: region offline: items,823ce1e3-d414-474f-ac70-c4081cecef0f,1214434560891
 at org.apache.hadoop.hbase.HTable.getRegionServerWithRetries(HTable.java:1062)
 at org.apache.hadoop.hbase.HTable.commit(HTable.java:763)
 at org.apache.hadoop.hbase.HTable.commit(HTable.java:744)
 at HBase.AddAttributes(HBase.java:220)
 at PoJaMigratorItems.Add(PoJaMigratorItems.java:143)
 at PoJaMigrator.AddItems(PoJaMigrator.java:123)
 at PoJaMigrator.AddAllData(PoJaMigrator.java:57)
 at PoJaMigrator.<init>(PoJaMigrator.java:27)
 at PoJaMigrator.main(PoJaMigrator.java:35)
Caused by: java.lang.IllegalStateException: region offline: items,823ce1e3-d414-474f-ac70-c4081cecef0f,1214434560891
 at org.apache.hadoop.hbase.HConnectionManager$TableServers.locateRegionInMeta(HConnectionManager.java:438)
 at org.apache.hadoop.hbase.HConnectionManager$TableServers.locateRegion(HConnectionManager.java:350)
 at org.apache.hadoop.hbase.HConnectionManager$TableServers.relocateRegion(HConnectionManager.java:318)
 at org.apache.hadoop.hbase.HTable.getRegionLocation(HTable.java:114)
 at org.apache.hadoop.hbase.HTable$ServerCallable.instantiateServer(HTable.java:1021)
 at org.apache.hadoop.hbase.HTable.getRegionServerWithRetries(HTable.java:1036)
 ... 8 more





[jira] Updated: (HBASE-707) High-load import of data into single table/family never triggers split

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-707:
------------------------

    Attachment: 707.patch

Have been working with John on his cluster on this issue.  This patch seems to fix the issue (more testing to do).

Splits are triggered if the compaction run returns true.  The return value out of compaction was coming up from the depths of the store file code and could be mangled on the way if there were multiple families in a region; one might compact but the subsequent one might not.  Because of the latter, we'd not run the split check.
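
To illustrate the failure mode described above, here is a minimal sketch. It is not the contents of 707.patch, and every name in it (CompactionResultSketch, compactLastWins, compactAnyWins) is hypothetical; it only assumes that each family reports a boolean "did compact" result and that the split check runs when the combined result is true.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical illustration only; these are not HBase classes or methods.
public class CompactionResultSketch {

    // Mangled pattern: each family's result overwrites the previous one,
    // so only the last family decides whether the split check runs.
    static boolean compactLastWins(List<BooleanSupplier> families) {
        boolean compacted = false;
        for (BooleanSupplier family : families) {
            compacted = family.getAsBoolean();
        }
        return compacted;
    }

    // Accumulating pattern: true if any family compacted.
    static boolean compactAnyWins(List<BooleanSupplier> families) {
        boolean compacted = false;
        for (BooleanSupplier family : families) {
            // Evaluate first so every family is still asked to compact.
            boolean result = family.getAsBoolean();
            compacted = compacted || result;
        }
        return compacted;
    }

    public static void main(String[] args) {
        // First family compacts, the subsequent one does not.
        List<BooleanSupplier> families = Arrays.asList(() -> true, () -> false);
        System.out.println("last wins: " + compactLastWins(families)); // false
        System.out.println("any wins:  " + compactAnyWins(families));  // true
    }
}
```

With the first pattern, a region whose earlier family compacted but whose last family did not would report false and skip the split check, which matches the symptom in this issue.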




[jira] Commented: (HBASE-707) High-load import of data into single table/family never triggers split

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608240#action_12608240 ] 

stack commented on HBASE-707:
-----------------------------

Thanks for confirming the patch, Jon. The ISE is because your clocks are way skewed.  Will fix that over in HBASE-710.  I'll commit this patch later tonight.




[jira] Updated: (HBASE-707) High-load import of data into single table/family never triggers split

Posted by "Jonathan Gray (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Gray updated HBASE-707:
--------------------------------

    Description: 
Importing a heavy amount of data into a single table and family.

One column in that family (the same fam:col for every row) contains a frequently large amount of UTF-8 data.  This column grows and grows but never causes a region split.

Currently there is a single mapfile containing nearly 10GB.

Eventually this has caused regions to crash with OOME, as described in HBASE-706


Table in question:

hql > describe items;
+-----------------------------------------------------------------------------+
| Column Family Descriptor                                                    |
+-----------------------------------------------------------------------------+
| name: cfrecs, max versions: 2, compression: NONE, in memory: false, max leng|
| th: 2147483647, bloom filter: none                                          |
+-----------------------------------------------------------------------------+
| name: clusters, max versions: 2, compression: NONE, in memory: false, max le|
| ngth: 2147483647, bloom filter: none                                        |
+-----------------------------------------------------------------------------+
| name: content, max versions: 2, compression: NONE, in memory: false, max len|
| gth: 2147483647, bloom filter: none                                         |
+-----------------------------------------------------------------------------+
| name: readby, max versions: 2, compression: NONE, in memory: false, max leng|
| th: 2147483647, bloom filter: none                                          |
+-----------------------------------------------------------------------------+
| name: receivedby, max versions: 2, compression: NONE, in memory: false, max |
| length: 2147483647, bloom filter: none                                      |
+-----------------------------------------------------------------------------+
| name: savedby, max versions: 2, compression: NONE, in memory: false, max len|
| gth: 2147483647, bloom filter: none                                         |
+-----------------------------------------------------------------------------+
| name: sentby, max versions: 2, compression: NONE, in memory: false, max leng|
| th: 2147483647, bloom filter: none                                          |
+-----------------------------------------------------------------------------+
7 columnfamily(s) in set. (0.34 sec)


  was:
Importing a heavy amount of data into a single table and family.

One column in that family (the same fam:col for every row) contains a frequently large amount of UTF-8 data.  This column grows and grows but never causes a region split.

Currently there is a single mapfile containing nearly 10GB.

Eventually this has caused regions to crash with OOME, as described in HBASE-706


Added table description




[jira] Resolved: (HBASE-707) High-load import of data into single table/family never triggers split

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-707.
-------------------------

    Resolution: Fixed

Committed to branch.  Trunk doesn't have this issue.  It has another: 




[jira] Assigned: (HBASE-707) High-load import of data into single table/family never triggers split

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reassigned HBASE-707:
---------------------------

    Assignee: stack

