Posted to dev@hbase.apache.org by "Andrei Dragomir (JIRA)" <ji...@apache.org> on 2009/09/29 13:22:15 UTC

[jira] Created: (HBASE-1874) Client Scanner mechanism that is used for HbaseAdmin methods (listTables, tableExists), is very slow if the client is far away from the HBase cluster

Client Scanner mechanism that is used for HbaseAdmin methods (listTables, tableExists), is very slow if the client is far away from the HBase cluster
-----------------------------------------------------------------------------------------------------------------------------------------------------

                 Key: HBASE-1874
                 URL: https://issues.apache.org/jira/browse/HBASE-1874
             Project: Hadoop HBase
          Issue Type: Improvement
    Affects Versions: 0.20.0
            Reporter: Andrei Dragomir
            Priority: Minor


I have a simple class that instantiates an HBaseAdmin object and tries to connect to a remote HBase cluster (latency about 300 ms).

Any simple operation, such as listTables or tableExists, takes a very long time because the client instantiates a scanner, so each row in the .META. table costs one client-to-cluster round trip. This is prohibitive for a far-away client.
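
To put rough numbers on the cost described above, here is a back-of-the-envelope sketch. Only the 300 ms latency comes from the report; the class name, the method name, and the .META. row count of 1000 are hypothetical, and the estimate ignores server-side time and any scanner open/close calls.

```java
public class MetaScanCost {
    /** One round trip per row: total wait is roughly rows * latency. */
    public static long perRowScanMillis(long metaRows, long roundTripMillis) {
        return metaRows * roundTripMillis;
    }

    public static void main(String[] args) {
        long latencyMs = 300;  // latency quoted in the report
        long metaRows = 1000;  // hypothetical .META. size, not from the report
        long totalMs = perRowScanMillis(metaRows, latencyMs);
        System.out.println("Estimated scan time: " + (totalMs / 1000) + " s");
    }
}
```

Under these assumptions a single listTables call would wait about 300 seconds on network latency alone.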


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HBASE-1874) Client Scanner mechanism that is used for HbaseAdmin methods (listTables, tableExists), is very slow if the client is far away from the HBase cluster

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-1874:
----------------------------------

    Fix Version/s: 0.21.0
                   0.20.1
           Status: Patch Available  (was: Open)



[jira] Updated: (HBASE-1874) Client Scanner mechanism that is used for HbaseAdmin methods (listTables, tableExists), is very slow if the client is far away from the HBase cluster

Posted by "Andrei Dragomir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrei Dragomir updated HBASE-1874:
-----------------------------------

    Attachment: HBASE-1874.patch

This patch uses the ScannerCallable caching parameter. This way, we only have a client/server round trip every x rows. The row count is defined in HConstants and set to 1000.
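
The effect of fetching several rows per call can be illustrated with a small simulation. This is not the patch itself and does not use the HBase API: fetchBatch is a hypothetical stand-in for the remote call, and the 2500-row table is made up. A real scanner also makes open/close calls that this sketch ignores.

```java
import java.util.ArrayList;
import java.util.List;

public class CachedScanSketch {
    /** Stand-in for the remote side: returns up to `limit` rows starting at `offset`. */
    public static List<String> fetchBatch(List<String> serverRows, int offset, int limit) {
        int end = Math.min(serverRows.size(), offset + limit);
        return new ArrayList<>(serverRows.subList(offset, end));
    }

    /** Scans every row, fetching `caching` rows per simulated round trip; returns the round trips used. */
    public static int countRoundTrips(List<String> serverRows, int caching) {
        if (caching < 1) {
            throw new IllegalArgumentException("caching must be >= 1");
        }
        int roundTrips = 0;
        int offset = 0;
        while (offset < serverRows.size()) {
            List<String> batch = fetchBatch(serverRows, offset, caching);
            roundTrips++;            // one simulated client/server exchange
            offset += batch.size();  // rows in the batch are consumed locally
        }
        return roundTrips;
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            rows.add("row-" + i);  // hypothetical table contents
        }
        System.out.println("caching=1:    " + countRoundTrips(rows, 1) + " round trips");
        System.out.println("caching=1000: " + countRoundTrips(rows, 1000) + " round trips");
    }
}
```

With 2500 simulated rows, caching of 1 needs 2500 exchanges while caching of 1000 needs 3, which is why the latency cost drops so sharply for a distant client.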



[jira] Updated: (HBASE-1874) Client Scanner mechanism that is used for HbaseAdmin methods (listTables, tableExists), is very slow if the client is far away from the HBase cluster

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1874:
-------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

Committed to branch and trunk.

Before committing, I removed the define from HConstants and instead made it an option, called hbase.meta.scanner.caching, that you can get from HBaseConfiguration. I set it to 100 rather than 1000 (hope you don't mind my being conservative, Andrei).

Thanks for the patch, Andrei.
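
For a client that sits far from the cluster, the option could presumably be raised in the client's hbase-site.xml. The fragment below is a sketch under that assumption: the property name and the default of 100 come from the comment above, while the value of 1000 is only an example (it matches the original patch) and would need to be checked against memory use on a large .META. table.

```
<property>
  <name>hbase.meta.scanner.caching</name>
  <!-- Example value only; the committed default is 100. -->
  <value>1000</value>
</property>
```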



[jira] Commented: (HBASE-1874) Client Scanner mechanism that is used for HbaseAdmin methods (listTables, tableExists), is very slow if the client is far away from the HBase cluster

Posted by "Andrew Purtell (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760658#action_12760658 ] 

Andrew Purtell commented on HBASE-1874:
---------------------------------------

Makes sense for MetaScanner to use caching in general.

+1

