Posted to issues@hbase.apache.org by "Sujee Maniyam (JIRA)" <ji...@apache.org> on 2011/09/19 20:33:08 UTC

[jira] [Created] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

add an option to presplit table to PerformanceEvaluation
--------------------------------------------------------

                 Key: HBASE-4440
                 URL: https://issues.apache.org/jira/browse/HBASE-4440
             Project: HBase
          Issue Type: Improvement
          Components: performance
            Reporter: Sujee Maniyam
            Priority: Minor


PerformanceEvaluation is a quick way to 'benchmark' an HBase cluster.  The current 'write*' operations do not pre-split the table, so a freshly created table starts as a single region and all inserts initially hit one region server.  Pre-splitting the table spreads the writes out from the start and should noticeably boost insert performance.

It would be nice to have an option to pre-split the table before the inserts begin.

It would look something like:

(a) hbase ...PerformanceEvaluation   --presplit=10 <other options>
(b) hbase ...PerformanceEvaluation   --presplit <other options>

(b) would pre-split the table using some default value (say, the number of region servers).
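
For illustration only, here is one way the two forms could be parsed. This is a sketch, not the attached patch: the class PresplitOption, the method parsePresplit and the regionServerCount parameter are hypothetical names, and falling back to the region server count is just the default suggested in (b).

```java
/** Hypothetical helper illustrating the two proposed forms of the option. */
class PresplitOption {
  static final String FLAG = "--presplit";

  /**
   * Returns the number of regions to pre-split into, or 0 when the argument
   * is not a presplit option. regionServerCount is the assumed default for
   * the bare form (b).
   */
  static int parsePresplit(String arg, int regionServerCount) {
    if (arg.equals(FLAG)) {
      return regionServerCount; // form (b): no value given, use the default
    }
    if (arg.startsWith(FLAG + "=")) {
      // form (a): explicit region count after the '='
      int n = Integer.parseInt(arg.substring(FLAG.length() + 1));
      if (n <= 0) {
        throw new IllegalArgumentException("presplit must be positive: " + n);
      }
      return n;
    }
    return 0; // some other option, presplit not requested
  }

  public static void main(String[] args) {
    System.out.println(parsePresplit("--presplit=10", 4));
    System.out.println(parsePresplit("--presplit", 4));
    System.out.println(parsePresplit("--rows=100", 4));
  }
}
```

Returning 0 for "not requested" keeps the caller's check to a single comparison (presplit > 0), which is also how the snippet later in this thread tests its presplitRegions field.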


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Sujee Maniyam (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sujee Maniyam updated HBASE-4440:
---------------------------------

    Attachment: PerformanceEvaluation_HBASE_4440.patch

patch attached
                
> add an option to presplit table to PerformanceEvaluation
> --------------------------------------------------------
>
>                 Key: HBASE-4440
>                 URL: https://issues.apache.org/jira/browse/HBASE-4440
>             Project: HBase
>          Issue Type: Improvement
>          Components: util
>            Reporter: Sujee Maniyam
>            Priority: Minor
>              Labels: benchmark
>         Attachments: PerformanceEvaluation_HBASE_4440.patch
>
>

[jira] [Commented] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Jean-Daniel Cryans (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184294#comment-13184294 ] 

Jean-Daniel Cryans commented on HBASE-4440:
-------------------------------------------

bq. whether we use presplit option or not, table has to be recreated for all write-mode tests.

No, it shouldn't be different from the default behavior of not recreating the table.

bq. or pre-split should try to split the table without re-creating it.

It should not.

Code speaks louder than words; here's what I'm using for testing 0.92:

{code}
  /**
   * Creates the test table if it does not exist yet, pre-split when requested.
   * An existing table is left untouched.
   * @return true if the table was created by this call
   */
  private boolean checkTable(HBaseAdmin admin) throws IOException {
    HTableDescriptor tableDescriptor = getTableDescriptor();
    boolean tableExists = admin.tableExists(tableDescriptor.getName());
    if (!tableExists) {
      if (this.presplitRegions > 0) {
        // Create the table with explicit split keys.
        byte[][] splits = getSplits();
        for (int i = 0; i < splits.length; i++) {
          LOG.debug(" split " + i + ": " + Bytes.toStringBinary(splits[i]));
        }
        admin.createTable(tableDescriptor, splits);
        LOG.info("Table created with " + this.presplitRegions + " splits");
      } else {
        // Default behavior: create the table without split keys.
        admin.createTable(tableDescriptor);
        LOG.info("Table " + tableDescriptor + " created");
      }
    }
    return !tableExists;
  }
{code}
                

[jira] [Resolved] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Nicolas Spiegelberg (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicolas Spiegelberg resolved HBASE-4440.
----------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.94.0

thanks Sujee!
                

[jira] [Commented] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Jean-Daniel Cryans (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184311#comment-13184311 ] 

Jean-Daniel Cryans commented on HBASE-4440:
-------------------------------------------

We could show a WARN, but I don't think we would need more than that. In fact, we could always show a message when the table exists saying something like: "Using the existing ${tablename} which has ${X} regions". 

About the pre-splitting itself, it seems that it creates N+1 regions and the first one has the end key 0000000000 so it never gets data. Not a biggie, but could be fixed in another jira.
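
To make the N+1 observation concrete, here is a small HBase-free sketch. It assumes (this is a guess, not a reading of the patch) that split keys are zero-padded 10-digit decimal strings generated in a loop starting at 0; the class SplitKeys and both methods are hypothetical. Starting at 0 emits "0000000000" as the first split key, which leaves an empty region in front of it, while starting at 1 yields N-1 keys and exactly N regions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical, HBase-free illustration of decimal split-key generation. */
class SplitKeys {
  /** Zero-padded 10-digit decimal key, e.g. 250 becomes "0000000250". */
  static String format(long row) {
    return String.format(Locale.ROOT, "%010d", row);
  }

  /**
   * Loop starts at i = 0, so the first split key is "0000000000".
   * N split keys mean N+1 regions, and the first region (everything
   * below "0000000000") can never receive a row.
   */
  static List<String> splitsFromZero(long totalRows, int regions) {
    List<String> keys = new ArrayList<>();
    long step = totalRows / regions;
    for (int i = 0; i < regions; i++) {
      keys.add(format(i * step));
    }
    return keys;
  }

  /** Loop starts at i = 1: N-1 split keys, hence exactly N regions. */
  static List<String> splitsSkippingZero(long totalRows, int regions) {
    List<String> keys = new ArrayList<>();
    long step = totalRows / regions;
    for (int i = 1; i < regions; i++) {
      keys.add(format(i * step));
    }
    return keys;
  }

  public static void main(String[] args) {
    System.out.println(splitsFromZero(1000, 4));
    System.out.println(splitsSkippingZero(1000, 4));
  }
}
```

Whether the real getSplits() has this shape would need checking against the patch; the sketch only shows why a leading all-zero split key produces a region that stays empty.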
                

[jira] [Commented] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Sujee Maniyam (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184309#comment-13184309 ] 

Sujee Maniyam commented on HBASE-4440:
--------------------------------------

I see.  Looks good.
If the table exists and the presplit option is supplied, the option will have no effect.  That might mislead the user into believing the pre-split took effect when in fact it didn't.
Maybe a WARN would suffice to notify the user?
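
A rough sketch of what such a notice could look like, assuming nothing about the committed code: the class ExistingTableNotice and its methods are hypothetical, java.util.logging stands in for whatever logger PerformanceEvaluation uses, and the table name in main is a placeholder. The message text follows the "Using the existing ${tablename} which has ${X} regions" wording floated elsewhere in this thread.

```java
import java.util.logging.Logger;

/** Hypothetical sketch of the proposed notice; not the committed code. */
class ExistingTableNotice {
  private static final Logger LOG = Logger.getLogger("PerformanceEvaluation");

  /** Message for a table that already exists and is being reused. */
  static String existingTableMessage(String tableName, int regionCount) {
    return "Using the existing " + tableName + " which has " + regionCount + " regions";
  }

  /**
   * Logs the reuse message; escalates to WARNING when a presplit was
   * requested, because the request is silently ignored in that case.
   * @return true if a presplit request was ignored
   */
  static boolean reportExisting(String tableName, int regionCount, int presplitRegions) {
    String msg = existingTableMessage(tableName, regionCount);
    if (presplitRegions > 0) {
      LOG.warning(msg + "; ignoring presplit=" + presplitRegions);
      return true;
    }
    LOG.info(msg);
    return false;
  }

  public static void main(String[] args) {
    reportExisting("SomeTable", 1, 10); // placeholder table name
    reportExisting("SomeTable", 1, 0);
  }
}
```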
                

[jira] [Assigned] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Jean-Daniel Cryans (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans reassigned HBASE-4440:
-----------------------------------------

    Assignee: Jean-Daniel Cryans
    

[jira] [Updated] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Sujee Maniyam (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sujee Maniyam updated HBASE-4440:
---------------------------------

    Attachment: PerformanceEvaluation_HBASE_4440.patch

patch
                

[jira] [Updated] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Sujee Maniyam (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sujee Maniyam updated HBASE-4440:
---------------------------------

    Component/s:     (was: performance)
                 util


[jira] [Commented] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Sujee Maniyam (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184329#comment-13184329 ] 

Sujee Maniyam commented on HBASE-4440:
--------------------------------------

Sounds good.  I will submit a patch.

A couple of newbie logistical questions:

1) Should I create a new patch against trunk?  The original patch is already committed in trunk / 0.94.

2) Do I leave old patch attachments in the JIRA, or should I delete them (to reduce clutter)?

Thanks, JD
                

[jira] [Commented] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Jean-Daniel Cryans (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183819#comment-13183819 ] 

Jean-Daniel Cryans commented on HBASE-4440:
-------------------------------------------

Yes, and if you don't use it, you get the old behavior of not recreating the table. Presplitting shouldn't be different.
                

[jira] [Updated] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Sujee Maniyam (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sujee Maniyam updated HBASE-4440:
---------------------------------

    Attachment:     (was: PerformanceEvaluation_HBASE_4440.patch)
    

[jira] [Assigned] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Jean-Daniel Cryans (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans reassigned HBASE-4440:
-----------------------------------------

    Assignee: Sujee Maniyam  (was: Jean-Daniel Cryans)

Ugh, I pressed the wrong button and assigned it to myself. I added Sujee as a contributor and reassigned this jira.
                

[jira] [Commented] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Nicolas Spiegelberg (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160552#comment-13160552 ] 

Nicolas Spiegelberg commented on HBASE-4440:
--------------------------------------------

Looks pretty good.  I wish you could use RegionSplitter.HexStringSplit to generate your keys (since that would be the recommended usage pattern), but I guess PerformanceEvaluation uses decimal strings instead.  Maybe a refactoring effort?

About the line
{code}
Thread.sleep(3000); // wait for things to settle down
{code}

deleteTable() should stall until all the regions associated with that table are removed.  Did you encounter some problems without that sleep?  We shouldn't have any arbitrary stalls in the code; instead, we should understand what else we need to wait on before creating the table again.
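
If something did turn out to need waiting on, a bounded poll on an explicit condition is one common alternative to a fixed sleep. The helper below is a generic sketch under that assumption: WaitFor and waitFor are hypothetical names, not HBase APIs, and the condition in main is a stand-in (in the benchmark it might be something like "the table no longer exists").

```java
import java.util.function.BooleanSupplier;

/** Hypothetical polling helper; not part of HBase or the patch. */
class WaitFor {
  /**
   * Polls the condition every intervalMs until it holds or timeoutMs elapses.
   * @return true if the condition became true in time; false on timeout
   *         or if the waiting thread was interrupted
   */
  static boolean waitFor(BooleanSupplier condition, long timeoutMs, long intervalMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // gave up: the caller decides how to report this
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt status
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    long start = System.currentTimeMillis();
    // Stand-in condition that becomes true after roughly 200 ms.
    boolean ok = waitFor(() -> System.currentTimeMillis() - start >= 200, 2000, 50);
    System.out.println("condition met: " + ok);
  }
}
```

Unlike the fixed sleep, this returns as soon as the condition holds and makes a timeout visible to the caller instead of silently proceeding.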
                

[jira] [Updated] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Sujee Maniyam (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sujee Maniyam updated HBASE-4440:
---------------------------------

    Attachment:     (was: PerformanceEvaluation.java)
    

[jira] [Commented] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163372#comment-13163372 ] 

Hudson commented on HBASE-4440:
-------------------------------

Integrated in HBase-TRUNK-security #23 (See [https://builds.apache.org/job/HBase-TRUNK-security/23/])
    HBASE-4440 add an option to presplit table to PerformanceEvaluation

nspiegelberg : 
Files : 
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java

                

[jira] [Updated] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Sujee Maniyam (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sujee Maniyam updated HBASE-4440:
---------------------------------

    Attachment: PerformanceEvaluation_HBASE_4440_2.patch

Thanks for the comments.

I have attached a revised patch.

I removed the sleep after deleting the table.  I had some issues with 0.90.1, but it works fine with 0.90.4.
                
> add an option to presplit table to PerformanceEvaluation
> --------------------------------------------------------
>
>                 Key: HBASE-4440
>                 URL: https://issues.apache.org/jira/browse/HBASE-4440
>             Project: HBase
>          Issue Type: Improvement
>          Components: util
>            Reporter: Sujee Maniyam
>            Priority: Minor
>              Labels: benchmark
>         Attachments: PerformanceEvaluation.java, PerformanceEvaluation_HBASE_4440.patch, PerformanceEvaluation_HBASE_4440_2.patch
>
>
> PerformanceEvaluation is a quick way to 'benchmark' an HBase cluster.  The current 'write*' operations do not pre-split the table.  Pre-splitting the table significantly boosts insert performance.
> It would be nice to have an option to pre-split the table before the inserts begin.
> It would look something like:
> (a) hbase ...PerformanceEvaluation   --presplit=10 <other options>
> (b) hbase ...PerformanceEvaluation   --presplit <other options>
> Option (b) would presplit the table using some default value (say, the number of region servers).


        

[jira] [Updated] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Sujee Maniyam (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sujee Maniyam updated HBASE-4440:
---------------------------------

    Attachment: PerformanceEvaluation.java
    
> add an option to presplit table to PerformanceEvaluation
> --------------------------------------------------------
>
>                 Key: HBASE-4440
>                 URL: https://issues.apache.org/jira/browse/HBASE-4440
>             Project: HBase
>          Issue Type: Improvement
>          Components: util
>            Reporter: Sujee Maniyam
>            Priority: Minor
>              Labels: benchmark
>         Attachments: PerformanceEvaluation.java, PerformanceEvaluation_HBASE_4440.patch
>
>
> PerformanceEvaluation is a quick way to 'benchmark' an HBase cluster.  The current 'write*' operations do not pre-split the table.  Pre-splitting the table significantly boosts insert performance.
> It would be nice to have an option to pre-split the table before the inserts begin.
> It would look something like:
> (a) hbase ...PerformanceEvaluation   --presplit=10 <other options>
> (b) hbase ...PerformanceEvaluation   --presplit <other options>
> Option (b) would presplit the table using some default value (say, the number of region servers).


        

[jira] [Commented] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Sujee Maniyam (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184284#comment-13184284 ] 

Sujee Maniyam commented on HBASE-4440:
--------------------------------------

So you are proposing that either:

1) Whether or not we use the presplit option, the table has to be re-created for all write-mode tests.

This changes the behavior for all write tests.  Currently the table is only created if it doesn't exist.

2) Or pre-split should try to split the table without re-creating it.
                
> add an option to presplit table to PerformanceEvaluation
> --------------------------------------------------------
>
>                 Key: HBASE-4440
>                 URL: https://issues.apache.org/jira/browse/HBASE-4440
>             Project: HBase
>          Issue Type: Improvement
>          Components: util
>            Reporter: Sujee Maniyam
>            Assignee: Sujee Maniyam
>            Priority: Minor
>              Labels: benchmark
>             Fix For: 0.94.0
>
>         Attachments: PerformanceEvaluation.java, PerformanceEvaluation_HBASE_4440.patch, PerformanceEvaluation_HBASE_4440_2.patch
>
>
> PerformanceEvaluation is a quick way to 'benchmark' an HBase cluster.  The current 'write*' operations do not pre-split the table.  Pre-splitting the table significantly boosts insert performance.
> It would be nice to have an option to pre-split the table before the inserts begin.
> It would look something like:
> (a) hbase ...PerformanceEvaluation   --presplit=10 <other options>
> (b) hbase ...PerformanceEvaluation   --presplit <other options>
> Option (b) would presplit the table using some default value (say, the number of region servers).


        

[jira] [Updated] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Sujee Maniyam (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sujee Maniyam updated HBASE-4440:
---------------------------------

    Attachment: PerformanceEvaluation.java

Added a --presplit option to pre-split the table.
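
For illustration, evenly spaced split points for a requested number of regions could be computed roughly as below. This is a minimal sketch and not the code from the attachment: it assumes row keys are zero-padded 10-digit decimal strings covering [0, totalRows), and the names PresplitSketch and splitKeys are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper for illustration; not taken from the attached patch. */
final class PresplitSketch {

    /**
     * Returns the (regions - 1) boundary keys that divide the key space
     * [0, totalRows) into `regions` roughly equal ranges.
     * Assumption: row keys are zero-padded 10-digit decimal strings.
     */
    static List<String> splitKeys(int totalRows, int regions) {
        // regions <= totalRows guarantees that all boundaries are distinct.
        if (regions < 1 || regions > totalRows) {
            throw new IllegalArgumentException(
                "need 1 <= regions <= totalRows, got regions=" + regions
                    + ", totalRows=" + totalRows);
        }
        List<String> keys = new ArrayList<>();
        for (int i = 1; i < regions; i++) {
            // long arithmetic so that i * totalRows cannot overflow an int
            long boundary = (long) i * totalRows / regions;
            // Locale.ROOT keeps the digits ASCII regardless of the JVM locale
            keys.add(String.format(Locale.ROOT, "%010d", boundary));
        }
        return keys;
    }

    public static void main(String[] args) {
        // 100 rows over 4 regions -> [0000000025, 0000000050, 0000000075]
        System.out.println(splitKeys(100, 4));
    }
}
```

Keys like these, converted to byte arrays, are the kind of input that HBaseAdmin.createTable(HTableDescriptor, byte[][]) accepts as split keys.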
                
> add an option to presplit table to PerformanceEvaluation
> --------------------------------------------------------
>
>                 Key: HBASE-4440
>                 URL: https://issues.apache.org/jira/browse/HBASE-4440
>             Project: HBase
>          Issue Type: Improvement
>          Components: util
>            Reporter: Sujee Maniyam
>            Priority: Minor
>              Labels: benchmark
>         Attachments: PerformanceEvaluation.java
>
>
> PerformanceEvaluation is a quick way to 'benchmark' an HBase cluster.  The current 'write*' operations do not pre-split the table.  Pre-splitting the table significantly boosts insert performance.
> It would be nice to have an option to pre-split the table before the inserts begin.
> It would look something like:
> (a) hbase ...PerformanceEvaluation   --presplit=10 <other options>
> (b) hbase ...PerformanceEvaluation   --presplit <other options>
> Option (b) would presplit the table using some default value (say, the number of region servers).


        

[jira] [Commented] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "ramkrishna.s.vasudevan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160575#comment-13160575 ] 

ramkrishna.s.vasudevan commented on HBASE-4440:
-----------------------------------------------

@Sujee

Wait until there are no regions in transition (RIT) instead of sleeping, as we do in TestMasterFailover:

{code}
log("Waiting for no more RIT");
ZKAssign.blockUntilNoRIT(zkw);
{code}
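
The same idea, stripped of HBase specifics, can be sketched as a poll-until-true helper with a timeout. This is only an illustration: WaitSketch and waitUntil are hypothetical names, not HBase APIs, and the stand-in condition would be replaced by a real check such as "no regions in transition".

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration; not an HBase API. */
final class WaitSketch {

    /**
     * Polls `condition` every `pollMillis` until it returns true or
     * `timeoutMillis` has elapsed. Returns true if the condition was met,
     * false on timeout or if the waiting thread was interrupted.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] checks = {0};
        // Stand-in condition: becomes true on the third check.
        boolean met = waitUntil(() -> ++checks[0] >= 3, 1000, 10);
        System.out.println("condition met: " + met + " after " + checks[0] + " checks");
    }
}
```

Compared with a fixed sleep, this returns as soon as the condition holds and reports a timeout explicitly instead of silently proceeding.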
                


        

[jira] [Commented] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Jean-Daniel Cryans (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183784#comment-13183784 ] 

Jean-Daniel Cryans commented on HBASE-4440:
-------------------------------------------

I tried the patch on 0.92. Why does using presplit give different behavior from not using it with regard to re-creating the table? Seems like a bug to me.
                


        

[jira] [Commented] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Hudson (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163416#comment-13163416 ] 

Hudson commented on HBASE-4440:
-------------------------------

Integrated in HBase-TRUNK #2520 (See [https://builds.apache.org/job/HBase-TRUNK/2520/])
    HBASE-4440 add an option to presplit table to PerformanceEvaluation

nspiegelberg : 
Files : 
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java

                


        

[jira] [Commented] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Sujee Maniyam (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183805#comment-13183805 ] 

Sujee Maniyam commented on HBASE-4440:
--------------------------------------

@JD
--presplit drops and re-creates the table with splits.
Is this what you mean?
                


        

[jira] [Commented] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Nicolas Spiegelberg (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160579#comment-13160579 ] 

Nicolas Spiegelberg commented on HBASE-4440:
--------------------------------------------

@ramkrishna: should this not be done in the deleteTable() code to make it synchronous instead of expecting all callers to know that it is necessary?
                


        

[jira] [Commented] (HBASE-4440) add an option to presplit table to PerformanceEvaluation

Posted by "Jean-Daniel Cryans (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184342#comment-13184342 ] 

Jean-Daniel Cryans commented on HBASE-4440:
-------------------------------------------

Please create a new JIRA, carry over some of the conversation we had here to justify it, and leave this JIRA as it is.

Good stuff.
                
