Posted to issues@hbase.apache.org by "Karthick Sankarachary (JIRA)" <ji...@apache.org> on 2011/04/13 22:40:05 UTC

[jira] [Created] (HBASE-3777) Redefine Identity Of HBase Configuration

Redefine Identity Of HBase Configuration
----------------------------------------

                 Key: HBASE-3777
                 URL: https://issues.apache.org/jira/browse/HBASE-3777
             Project: HBase
          Issue Type: Improvement
          Components: client, ipc
    Affects Versions: 0.90.2
            Reporter: Karthick Sankarachary
            Assignee: Karthick Sankarachary
            Priority: Minor
             Fix For: 0.92.0


Judging from the javadoc in {{HConnectionManager}}, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the one-to-one mapping between a configuration instance and a connection instance works against that goal. Specifically, when you create {{HTable}} instances using a given {{Configuration}} instance and a copy thereof, we end up with two distinct {{HConnection}} instances under the covers. Is this really the expected behavior, especially given that the configuration instance gets cloned a lot?

Here, I'd like to play devil's advocate and propose that we "deep-compare" {{HBaseConfiguration}} instances, so that multiple {{HBaseConfiguration}} instances with the same properties map to the same {{HConnection}} instance. If one is "concerned that a single {{HConnection}} is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given {{HBaseConfiguration}} instance as being "uniquely identifiable".

Note that "sharing connections makes clean up of {{HConnection}} instances a little awkward", unless, of course, you apply the change described in HBASE-3766.
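
To make the difference between the two keying strategies concrete, here is a minimal Java sketch. It is not HBase code: a plain {{Map<String, String>}} stands in for a configuration, {{FakeConnection}} stands in for {{HConnection}}, and every class and method name below is hypothetical.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: contrasts an identity-keyed cache with a content-keyed one.
final class ConnectionCacheSketch {
    /** Hypothetical stand-in for a connection; not the real HConnection. */
    static final class FakeConnection {}

    // Keyed by object reference: a copy of a configuration is a different key.
    private final Map<Map<String, String>, FakeConnection> byIdentity = new IdentityHashMap<>();

    // Keyed by contents: configurations with equal properties share one entry.
    private final Map<Map<String, String>, FakeConnection> byContent = new HashMap<>();

    FakeConnection getByIdentity(Map<String, String> conf) {
        return byIdentity.computeIfAbsent(conf, k -> new FakeConnection());
    }

    FakeConnection getByContent(Map<String, String> conf) {
        // Store a private copy as the key so that later changes to conf
        // cannot alter a key that is already in the cache.
        return byContent.computeIfAbsent(new TreeMap<>(conf), k -> new FakeConnection());
    }
}
```

With the identity-keyed cache, a configuration and its copy yield two connections; with the content-keyed cache, they yield one.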

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthick Sankarachary updated HBASE-3777:
-----------------------------------------

    Attachment: HBASE-3777-V3.patch

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020088#comment-13020088 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

It turns out HBASE-3708 broke the build.
After getting over that bug, I got:
{code}
-------------------------------------------------------------------------------
Test set: org.apache.hadoop.hbase.client.TestHCM
-------------------------------------------------------------------------------
Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 38.137 sec <<< FAILURE!
testManyNewConnectionsDoesnotOOME(org.apache.hadoop.hbase.client.TestHCM)  Time elapsed: 7.349 sec  <<< FAILURE!
java.lang.AssertionError: expected:<31> but was:<1>
        at org.junit.Assert.fail(Assert.java:91)
        at org.junit.Assert.failNotEquals(Assert.java:645)
        at org.junit.Assert.assertEquals(Assert.java:126)
        at org.junit.Assert.assertEquals(Assert.java:470)
        at org.junit.Assert.assertEquals(Assert.java:454)
        at org.apache.hadoop.hbase.client.TestHCM.createNewConfigurations(TestHCM.java:109)
        at org.apache.hadoop.hbase.client.TestHCM.testManyNewConnectionsDoesnotOOME(TestHCM.java:78)
{code}

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020912#comment-13020912 ] 

Karthick Sankarachary commented on HBASE-3777:
----------------------------------------------

Ah, speaking of {{HConnectionManager#deleteAllConnections}}, notice that we did not previously remove the connection from the cache after closing it. That's not good, because someone might end up getting a connection that has already been closed. The V2 version of the patch fixes that by clearing the cache.
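
The hazard can be sketched as follows. This is a simplified illustration rather than the code in the patch: {{FakeConnection}}, the string key, and the method names are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a cache whose deleteAll() both closes and evicts entries.
final class ClosingCacheSketch {
    /** Hypothetical stand-in for a connection; not the real HConnection. */
    static final class FakeConnection {
        private boolean closed;
        void close() { closed = true; }
        boolean isClosed() { return closed; }
    }

    private final Map<String, FakeConnection> cache = new HashMap<>();

    synchronized FakeConnection get(String key) {
        return cache.computeIfAbsent(key, k -> new FakeConnection());
    }

    synchronized void deleteAll() {
        for (FakeConnection connection : cache.values()) {
            connection.close();
        }
        // Without this call, the next get() would return a closed connection.
        cache.clear();
    }
}
```

After {{deleteAll()}}, a subsequent {{get()}} with the same key creates a fresh, open connection instead of returning the closed one.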

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13024968#comment-13024968 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------



bq.  On 2011-04-23 03:02:04, Ted Yu wrote:
bq.  > src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java, line 147
bq.  > <https://reviews.apache.org/r/643/diff/3/?file=16927#file16927line147>
bq.  >
bq.  >     Here we mix user code with test cluster management code.
bq.  >     I think table.close() should be called first in the finally block.

Closing the table before shutting down the cluster seems appropriate.


- Karthick
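
A rough sketch of the teardown order discussed above, assuming nothing about the actual test: the event strings and the {{run}} helper are hypothetical, and a list of events stands in for the real table and mini-cluster.

```java
import java.util.List;

// Illustrative only: records the order in which teardown steps happen.
final class TeardownOrderSketch {
    static void run(List<String> events, Runnable testBody) {
        events.add("cluster.start");
        events.add("table.open");
        try {
            testBody.run();
        } finally {
            try {
                // Release the client-side resource first, while the cluster is still up.
                events.add("table.close");
            } finally {
                // Shut the cluster down even if closing the table fails.
                events.add("cluster.shutdown");
            }
        }
    }
}
```

The nested finally keeps the cluster shutdown from being skipped if the table close throws.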


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review532
-----------------------------------------------------------


On 2011-04-22 21:16:59, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/643/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-22 21:16:59)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the one-to-one mapping between a configuration instance and a connection instance works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really the expected behavior, especially given that the configuration instance gets cloned a lot?
bq.  
bq.  Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances with the same properties map to the same HConnection instance. If one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".
bq.  
bq.  
bq.  This addresses bug HBASE-3777.
bq.      https://issues.apache.org/jira/browse/HBASE-3777
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
bq.    src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
bq.    src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java 79a48ba 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11 
bq.    src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 
bq.    src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 
bq.    src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 
bq.  
bq.  Diff: https://reviews.apache.org/r/643/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Karthick
bq.  
bq.



> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: 3777-TOF.patch, HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023492#comment-13023492 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review532
-----------------------------------------------------------



src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java
<https://reviews.apache.org/r/643/#comment1108>

    Here we mix user code with test cluster management code.
    I think table.close() should be called first in the finally block.


- Ted





> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023415#comment-13023415 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/
-----------------------------------------------------------

(Updated 2011-04-22 21:16:59.765464)


Review request for hbase and Ted Yu.


Changes
-------

The V6 version of the patch fixes the test failures in V5 by:

a) Adding a package-level CatalogTracker constructor so that TestCatalogTracker can continue to use its Connection mock object.
b) Deleting the connection from HConnectionManager#HBASE_INSTANCES if its finalize method is called.
c) Removing the HTable#finalize method, which might cause a closed connection to be returned to the current thread.

There were no test failures with the V6 version of the patch. Please let me know if we need to tweak this further.


Summary
-------

Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the one-to-one mapping between a configuration instance and a connection instance works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really the expected behavior, especially given that the configuration instance gets cloned a lot?

Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances with the same properties map to the same HConnection instance. If one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".


This addresses bug HBASE-3777.
    https://issues.apache.org/jira/browse/HBASE-3777


Diffs (updated)
-----

  src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
  src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
  src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
  src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a 
  src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
  src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
  src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
  src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
  src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java 79a48ba 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11 
  src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
  src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e 
  src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3 
  src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 
  src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 

Diff: https://reviews.apache.org/r/643/diff


Testing
-------

mvn test


Thanks,

Karthick



> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027084#comment-13027084 ] 

Karthick Sankarachary commented on HBASE-3777:
----------------------------------------------

{quote}The mapping should really be "cluster uuid" (if such a thing exists) to connection. Perhaps there's a hmaster md5 that can be used in lieu of cluster-uuid sitting in ZK that can be probed?{quote}

The thing is that an {{HConnection}}'s behavior is determined not just by the server-side cluster it goes against, but also by its client-side properties, such as "hbase.client.retries.number", "hbase.client.prefetch.limit", and so on. Ergo, we really need a different connection for every unique set of connection-specific config properties, whether they be client- or server-specific.

{quote}Perhaps there's a hmaster md5 that can be used in lieu of cluster-uuid sitting in ZK that can be probed?{quote}
As per the [ZK/HBase use cases|http://wiki.apache.org/hadoop/ZooKeeper/HBaseUseCases] wiki, in theory we can have multiple masters registered with ZK (to eliminate any SPOFs perhaps?). So, I'm not sure we can presuppose which hmaster we'll be going to at any given point in time.

{quote}Then, an alternative other way is to go ahead and make the extra connection and use it to determine which cluster the client is going against. If it's a previously-seen cluster, close this newly-created connection, and use the stashed one. Else this is a new cluster and create a new mapping entry.{quote}
The whole purpose of this patch is to reduce the number of connections by reusing them to the extent possible. At one point, the config's {{equals}} method was treated as the key to the connection, which promoted reuse to some extent but broke down if the config was changed after the fact. Currently, the config's identity (object reference) is treated as the key, but that leads to an excess of connections. Hopefully, the {{HConnectionKey}} defined in the HCM will serve as a happy medium between the two ends of the spectrum.
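
As a rough illustration of that middle ground (and not the {{HConnectionKey}} from the patch), a key could snapshot only the connection-specific properties at creation time. The class name, the choice of a {{Map<String, String>}} as the configuration stand-in, and the exact property list below are assumptions made for the example; the last two property names are the ones mentioned above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a cache key built from a fixed subset of properties.
final class ConnectionKeySketch {
    // Assumed list of properties that influence connection behavior.
    static final List<String> CONNECTION_PROPERTIES = Arrays.asList(
            "hbase.zookeeper.quorum",
            "hbase.client.retries.number",
            "hbase.client.prefetch.limit");

    // Values are copied at construction, so later changes to the
    // configuration do not change an existing key.
    private final Map<String, String> snapshot = new HashMap<>();

    ConnectionKeySketch(Map<String, String> conf) {
        for (String name : CONNECTION_PROPERTIES) {
            String value = conf.get(name);
            if (value != null) {
                snapshot.put(name, value);
            }
        }
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ConnectionKeySketch
                && snapshot.equals(((ConnectionKeySketch) other).snapshot);
    }

    @Override
    public int hashCode() {
        return snapshot.hashCode();
    }
}
```

Two configurations that differ only in unrelated properties produce equal keys, while a change to a connection-specific property produces a different key for configurations read after the change.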

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: 3777-TOF.patch, HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026724#comment-13026724 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/
-----------------------------------------------------------

(Updated 2011-04-28 22:01:18.165256)


Review request for hbase and Ted Yu.


Changes
-------

The V8 version of the diff addresses Todd's concerns about leaks in the event of exceptions. In short, it wraps every (method-level) block that accesses the connection in a call to the HCM#execute method, which takes care of acquiring and closing the connection. Specifically, an exception thrown by close will be swallowed if (and only if) the block itself throws one. There were no regressions, AFAIK, although the TestHRegionLocation and TestCatalogTracker tests did fail.
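
The close-handling rule described above can be sketched roughly as follows. This is an illustration only, not the code in the diff: the {{Work}} interface and the generic {{execute}} signature are hypothetical, and a plain {{Closeable}} stands in for the connection.

```java
import java.io.Closeable;
import java.io.IOException;

public class ExecuteSketch {
  /** Hypothetical stand-in for a block of work that needs a connection. */
  interface Work<C, T> {
    T apply(C connection) throws IOException;
  }

  /**
   * Runs the work against the connection and always closes it afterwards.
   * A failure in close() is propagated only when the work itself succeeded.
   */
  static <C extends Closeable, T> T execute(C connection, Work<C, T> work)
      throws IOException {
    boolean succeeded = false;
    try {
      T result = work.apply(connection);
      succeeded = true;
      return result;
    } finally {
      try {
        connection.close();
      } catch (IOException e) {
        if (succeeded) {
          throw e; // nothing else went wrong, so report the close failure
        }
        // otherwise swallow it so the original exception is not masked
      }
    }
  }
}
```

The point of the {{succeeded}} flag is that the caller always sees the most informative exception: the one from the block if there is one, and the one from close only when the block completed normally.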


Summary
-------

Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?

Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".


This addresses bug HBASE-3777.
    https://issues.apache.org/jira/browse/HBASE-3777


Diffs (updated)
-----

  src/main/java/org/apache/hadoop/hbase/HConstants.java 0911375 
  src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java feed777 
  src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java fa7448b 
  src/main/java/org/apache/hadoop/hbase/client/HConnection.java 1beedaf 
  src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java ded51c8 
  src/main/java/org/apache/hadoop/hbase/client/HTable.java 46bac9f 
  src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
  src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 26d0b31 
  src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
  src/main/java/org/apache/hadoop/hbase/mapred/TableOutputFormat.java 80284bb 
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java 05d9395 
  src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java 58c9153 
  src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 64c14df 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d211b53 
  src/main/java/org/apache/hadoop/hbase/replication/master/ReplicationLogCleaner.java 133da33 
  src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java fc71f03 
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 1c1d94b 
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java 39f3af2 
  src/main/java/org/apache/hadoop/hbase/util/HMerge.java c447287 
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java a4def94 
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
  src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java 75613b8 
  src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java 126d9af 
  src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 18e647e 
  src/test/java/org/apache/hadoop/hbase/client/TestHCM.java 5d71d75 
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 
  src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java 752e12b 
  src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 

Diff: https://reviews.apache.org/r/643/diff


Testing
-------

mvn test


Thanks,

Karthick




[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022802#comment-13022802 ] 

Karthick Sankarachary commented on HBASE-3777:
----------------------------------------------

My bad - I didn't get a chance to test version 5 yet. Will apply your fixes above and keep you posted on my findings. Are we on the right track, in general?


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13024977#comment-13024977 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------



bq.  On 2011-04-25 20:05:54, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 259
bq.  > <https://reviews.apache.org/r/643/diff/3/?file=16912#file16912line259>
bq.  >
bq.  >     Yeah, this is ugly.... it's almost as though you should have a special method for it, one that does not up the counters?
bq.  
bq.  Karthick Sankarachary wrote:
bq.      Just a thought - how about if we hide the ugliness in HCM, like so:
bq.      
bq.        public abstract class Connectable<T> {
bq.          public Configuration conf;
bq.      
bq.          public Connectable(Configuration conf) {
bq.            this.conf = conf;
bq.          }
bq.      
bq.          public abstract T connect(HConnection connection) throws IOException;
bq.        }
bq.      
bq.        public static <T> T execute(Connectable<T> connectable) throws IOException {
bq.          if (connectable == null || connectable.conf == null) {
bq.            return null;
bq.          }
bq.          Configuration conf = connectable.conf;
bq.          HConnection connection = HConnectionManager.getConnection(conf);
bq.          try {
bq.            return connectable.connect(connection);
bq.          } finally {
bq.            HConnectionManager.deleteConnection(conf, false);
bq.          }
bq.        }
bq.      
bq.      That way, the HTable call would look somewhat prettier:
bq.      
bq.        HConnectionManager.execute(new Connectable<Boolean>(conf) {
bq.          public Boolean connect(HConnection connection) throws IOException {
bq.            return connection.isTableEnabled(tableName);
bq.          }
bq.        });

BTW, if we bypass the reference counters in this situation, there's a chance, albeit small, that the connection might get closed by someone else while this guy is still trying to talk to it, which could result in a "connection is closed" type of error.
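
To illustrate why the counters matter, here is a minimal reference-counting sketch. The names {{RefCountSketch}}, {{CountedConnection}}, {{acquire}} and {{release}} are made up for this example and do not correspond to the methods in the patch; the string key is a stand-in for whatever identifies the cluster.

```java
import java.util.HashMap;
import java.util.Map;

public class RefCountSketch {
  /** Hypothetical connection that remembers how many callers hold it. */
  static final class CountedConnection {
    private int refCount;
    private boolean closed;

    boolean isClosed() {
      return closed;
    }
  }

  private final Map<String, CountedConnection> connections =
      new HashMap<String, CountedConnection>();

  /** Returns the shared connection for the key and bumps its count. */
  synchronized CountedConnection acquire(String key) {
    CountedConnection c = connections.get(key);
    if (c == null) {
      c = new CountedConnection();
      connections.put(key, c);
    }
    c.refCount++;
    return c;
  }

  /** Drops one reference; the connection is closed only when nobody holds it. */
  synchronized void release(String key) {
    CountedConnection c = connections.get(key);
    if (c == null) {
      return;
    }
    if (--c.refCount <= 0) {
      c.closed = true;
      connections.remove(key);
    }
  }
}
```

A caller that obtains the connection without going through {{acquire}} is invisible to the count, so another caller's final {{release}} can close the connection while the first caller is still using it. That is the race described above.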


- Karthick


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review543
-----------------------------------------------------------


On 2011-04-22 21:16:59, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/643/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-22 21:16:59)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
bq.  
bq.  Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".
bq.  
bq.  
bq.  This addresses bug HBASE-3777.
bq.      https://issues.apache.org/jira/browse/HBASE-3777
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
bq.    src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
bq.    src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java 79a48ba 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11 
bq.    src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 
bq.    src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 
bq.    src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 
bq.  
bq.  Diff: https://reviews.apache.org/r/643/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Karthick
bq.  
bq.




[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023229#comment-13023229 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

TestZooKeeper failed when I ran the whole test suite.
When I ran the test alone, it passed.


[jira] [Updated] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthick Sankarachary updated HBASE-3777:
-----------------------------------------

    Attachment:     (was: HBASE-3777.patch)


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020108#comment-13020108 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

For immutability (see comment @ 14/Apr/11 20:26), I think we can utilize the following from Guava to represent the conf field in HConnectionKey:
{code}
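      // Collect the connection-relevant properties, then freeze them.
      ImmutableMap.Builder<String, String> builder = ImmutableMap.builder();
      builder.put("a", "b");
      ImmutableMap<String, String> conf = builder.build();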
{code}



[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022751#comment-13022751 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review521
-----------------------------------------------------------



src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<https://reviews.apache.org/r/643/#comment1069>

    deleteConnection() may remove connectionKey and lead to ConcurrentModificationException.
    
    After restoring deleteAllConnections() to that of version 4, TestHBaseTestingUtility passes.
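
    For reference, a minimal sketch of the failure mode and one common way around it. This is not the version 4 code: the class and method names below are hypothetical, and a plain HashMap of strings stands in for HBASE_INSTANCES.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

public class DeleteAllSketch {
  // Stand-in for the connection cache; keys and values are plain strings here.
  static final Map<String, String> INSTANCES = new HashMap<String, String>();

  /** Removes one entry; stands in for a deleteConnection-style call. */
  static void deleteConnection(String key) {
    INSTANCES.remove(key);
  }

  /** Unsafe: the map is modified while its key set is being iterated. */
  static void deleteAllUnsafe() {
    for (String key : INSTANCES.keySet()) {
      deleteConnection(key);
    }
  }

  /** Safe: iterate over a snapshot of the keys instead of the live key set. */
  static void deleteAllSafe() {
    synchronized (INSTANCES) {
      for (String key : new ArrayList<String>(INSTANCES.keySet())) {
        deleteConnection(key);
      }
    }
  }
}
```

    With more than one entry in the map, the unsafe variant fails with ConcurrentModificationException as soon as the iterator advances after the first removal; the snapshot variant does not, because the loop never touches the live key set.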


- Ted


On 2011-04-21 07:23:09, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/643/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-21 07:23:09)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
bq.  
bq.  Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".
bq.  
bq.  
bq.  This addresses bug HBASE-3777.
bq.      https://issues.apache.org/jira/browse/HBASE-3777
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnection.java 2bb4725 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
bq.    src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
bq.    src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java 79a48ba 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11 
bq.    src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 
bq.    src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 
bq.  
bq.  Diff: https://reviews.apache.org/r/643/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Karthick
bq.  
bq.




[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022554#comment-13022554 ] 

Karthick Sankarachary commented on HBASE-3777:
----------------------------------------------

Ted, 

First off, thanks for keeping me honest. To answer your comments:

{quote}There was one little conflict in HConnection.java where J-D recently put in:{quote}

Resolved.

{quote}We can break out of the loop in HConnectionManager.putConnection() if the reference count reaches 0, right ?{quote}

Yes, we can break out of that loop if we find the connection we're looking for.

{quote}I don't find the following method in HConnectionManager (putConnection) called elsewhere{quote}

That's correct, I'll take it out.

{quote}The following method (i.e. close()) is called only by finalizer. I think we should call it when reference count reaches 0.{quote}

Ideally, the reference count should already be zero by the time {{HConnectionImplementation#finalize}} is called. The fact that the method was invoked implies that every reference to the connection has already gone out of scope, so checking our own reference count is unnecessary. In fact, if someone forgets to close the connection, the count will not be zero; checking it would then prevent us from releasing the connection's resources, even though the JVM says the connection is no longer in use.

{quote}Also, in deleteConnection(), the code starting line 221 should be enclosed in synchronized block. All the other accesses to HBASE_INSTANCES are protected.{quote}
I believe you are referring to the {{getConnection}} method. I fixed it so that all accesses to HBASE_INSTANCES are protected.

{quote}Should MAX_CACHED_HBASE_INSTANCES be increased ?{quote}
In theory, the number of connections from a given client to the zookeeper can be changed using the "hbase.zookeeper.property.maxClientCnxns" property. So, it's not clear to me why MAX_CACHED_HBASE_INSTANCES is even a constant to begin with. I think this topic deserves its own separate issue.

{quote}HTablePool.closeTablePool() no longer calls {{deleteConnection}}:{quote}

I put it back, given the shutdown semantics of {{HTablePool}}.

{quote}I think the following test failure is a regression (on Linux) (TestHBaseTestingUtility){quote}

Yes, that's one of the two test cases that failed for me - the log leads me to believe it is unable to delete a certain file, but I'll take a closer look.

Please review the revised patch (version V4), which has also been uploaded to the [review board|https://reviews.apache.org/r/643/].

Regards,
Karthick

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777.patch

[jira] [Updated] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthick Sankarachary updated HBASE-3777:
-----------------------------------------

    Attachment: HBASE-3777.patch

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026084#comment-13026084 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review596
-----------------------------------------------------------



src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
<https://reviews.apache.org/r/643/#comment1241>

    try-with-resources in JDK 7 would be useful in our case:
    http://hg.openjdk.java.net/jdk7/tl/jdk/rev/6e33b377aa6e
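
    As a rough illustration of the try-with-resources idea, here is a minimal sketch. {{DemoAdmin}} is a hypothetical stand-in and not the real {{HBaseAdmin}}; the sketch only assumes that the class holding the connection implements {{AutoCloseable}}, which requires JDK 7 or later.

```java
/** Hypothetical closeable handle; illustrative only, not the HBaseAdmin API. */
final class DemoAdmin implements AutoCloseable {
  /** Number of handles currently open, tracked only so the demo can be checked. */
  static int openHandles = 0;

  DemoAdmin() {
    openHandles++;
  }

  String status() {
    return "ok";
  }

  /** Releases the handle; invoked automatically at the end of the try block. */
  @Override
  public void close() {
    openHandles--;
  }
}

final class TryWithResourcesDemo {
  /** Opens a handle, uses it, and relies on try-with-resources to close it. */
  static String checkStatus() {
    try (DemoAdmin admin = new DemoAdmin()) {
      return admin.status();
    }
  }
}
```

    With this shape the caller cannot forget the close call, because the compiler generates it for both the normal and the exceptional path.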


- Ted


On 2011-04-28 00:13:52, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/643/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-28 00:13:52)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  This addresses bug HBASE-3777.
bq.      https://issues.apache.org/jira/browse/HBASE-3777
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 70affa0 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
bq.    src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
bq.    src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
bq.    src/main/java/org/apache/hadoop/hbase/mapred/TableOutputFormat.java 80284bb 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java 05d9395 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java 250a8cf 
bq.    src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 04befe9 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11 
bq.    src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 
bq.    src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 
bq.    src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java 915cdf6 
bq.    src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 
bq.  
bq.  Diff: https://reviews.apache.org/r/643/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Karthick
bq.  
bq.



> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: 3777-TOF.patch, HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023485#comment-13023485 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review531
-----------------------------------------------------------



src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<https://reviews.apache.org/r/643/#comment1107>

    Shall we break out of the loop here ?


- Ted


On 2011-04-22 21:16:59, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/643/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-22 21:16:59)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  This addresses bug HBASE-3777.
bq.      https://issues.apache.org/jira/browse/HBASE-3777
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
bq.    src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
bq.    src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java 79a48ba 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11 
bq.    src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 
bq.    src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 
bq.    src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 
bq.  
bq.  Diff: https://reviews.apache.org/r/643/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Karthick
bq.  
bq.



> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019630#comment-13019630 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

Allow me to add step 2.5:
apply the implementation from step 2 on existing (and new) unit tests for validation.

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021949#comment-13021949 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

HTablePool.closeTablePool() no longer calls this:
{code}
-    HConnectionManager.deleteConnection(this.config, true);
{code}
I think it should be kept because HTablePool.closeTablePool() is "a 'shutdown' of the given table pool" (according to javadoc).

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777.patch

[jira] [Updated] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthick Sankarachary updated HBASE-3777:
-----------------------------------------

    Status: Patch Available  (was: Open)

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019601#comment-13019601 ] 

Jean-Daniel Cryans commented on HBASE-3777:
-------------------------------------------

This is one of the most important ones; it also removed both hashCode and equals from HBaseConfiguration: HBASE-2925.
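
For illustration only, the "deep-compare" idea from the issue description could be approximated without putting equals and hashCode back on {{HBaseConfiguration}}, by deriving a separate cache key from a few property values. The sketch below uses a plain {{Map}} in place of a Hadoop {{Configuration}} so that it is self-contained; the {{ConnectionKey}} class and the choice of properties are assumptions, not what any attached patch does.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical cache key that compares selected property values, not object identity. */
final class ConnectionKey {
  // Property names picked for illustration; the set that really identifies
  // a cluster connection is an assumption here.
  static final List<String> KEY_PROPERTIES = Arrays.asList(
      "hbase.zookeeper.quorum", "zookeeper.znode.parent");

  private final Map<String, String> values = new HashMap<String, String>();

  /** Snapshots the key properties so later changes to conf do not alter the key. */
  ConnectionKey(Map<String, String> conf) {
    for (String name : KEY_PROPERTIES) {
      String value = conf.get(name);
      if (value != null) {
        values.put(name, value);
      }
    }
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof ConnectionKey && values.equals(((ConnectionKey) o).values);
  }

  @Override
  public int hashCode() {
    return values.hashCode();
  }
}
```

Two configurations that differ only in properties outside {{KEY_PROPERTIES}} then produce equal keys, so a map keyed by {{ConnectionKey}} would hand both the same connection.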

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777.patch

[jira] [Updated] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-3777:
--------------------------

    Attachment: 3777-TOF.patch

This patch changes the way TableOutputFormat closes its connection.
A similar change would be applied to mapred.TableOutputFormat.

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: 3777-TOF.patch, HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021934#comment-13021934 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

I don't find the following method in HConnectionManager called elsewhere:
{code}
  public static void putConnection(Configuration conf) {
    deleteConnection(conf, false);
  }
{code}

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020910#comment-13020910 ] 

Karthick Sankarachary commented on HBASE-3777:
----------------------------------------------

{quote}See Effective Java, 2nd edition, page 31 about using finalizer.{quote}

Point well taken. The only reason I went with the {{finalizer}} approach, which may not be ideal, is that it is safe, in the sense that the connection will eventually get closed, if there are truly no references to it. 

Alternatively, we can take the approach you described (not unlike the one in HBASE-3766), where the reference count is maintained by the connection manager. Specifically, the count could be incremented in {{HConnectionManager#getConnection}} and decremented in {{HConnectionManager#deleteConnection}}. As long as there are references to a connection, it will stay in the cache. When the count drops to zero, we go ahead and close it. The catch with this approach is that every connection acquired must be explicitly deleted; otherwise we run the risk of never being able to close it (currently, there are 33 calls to {{HConnectionManager#getConnection}}, but only 9 calls to {{HConnectionManager#deleteConnection}}). Note that the {{finalizer}} approach does not have this problem, since the JVM tracks for us whether the connection is still reachable.

In any case, this patch requires some sort of a reference count mechanism, regardless of how it is implemented.
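
To make the manager-side counting concrete, here is a minimal sketch. It is not the patch itself: {{SketchConnection}} and {{SketchConnectionManager}} are hypothetical stand-ins, and the cache key is assumed to be a plain string rather than a {{Configuration}}.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a shared connection; not the real HConnection. */
class SketchConnection {
  private boolean closed = false;
  int refCount = 0;

  void close() { closed = true; }
  boolean isClosed() { return closed; }
}

/** Hypothetical manager that keeps one reference count per cached connection. */
class SketchConnectionManager {
  private final Map<String, SketchConnection> cache =
      new HashMap<String, SketchConnection>();

  /** Returns the cached connection for the key, creating it on first use, and counts the caller. */
  synchronized SketchConnection getConnection(String key) {
    SketchConnection conn = cache.get(key);
    if (conn == null) {
      conn = new SketchConnection();
      cache.put(key, conn);
    }
    conn.refCount++;
    return conn;
  }

  /** Drops one reference; evicts and closes the connection once no references remain. */
  synchronized void deleteConnection(String key) {
    SketchConnection conn = cache.get(key);
    if (conn == null) {
      return; // unknown key, or already released
    }
    conn.refCount--;
    if (conn.refCount <= 0) {
      cache.remove(key);
      conn.close();
    }
  }
}
```

The sketch also shows the catch mentioned above: a caller that never invokes {{deleteConnection}} keeps the count above zero, so the connection is never closed.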

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777.patch
>
>

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044258#comment-13044258 ] 

Hudson commented on HBASE-3777:
-------------------------------

Integrated in HBase-TRUNK #1951 (See [https://builds.apache.org/hudson/job/HBase-TRUNK/1951/])
    HBASE-3592 Guava snuck back in as a dependency via hbase-3777


> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: 3777-TOF.patch, HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777.patch
>
>

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022655#comment-13022655 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------



bq.  On 2011-04-21 00:36:11, Todd Lipcon wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/HConstants.java, line 417
bq.  > <https://reviews.apache.org/r/643/diff/1/?file=16721#file16721line417>
bq.  >
bq.  >     specify that, even if the instance ids are the same, it could result in non-shared Connections if some of the other parameters differ. Right?

I made a note of that (verbatim) in that property's comment.


bq.  On 2011-04-21 00:36:11, Todd Lipcon wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java, line 61
bq.  > <https://reviews.apache.org/r/643/diff/1/?file=16722#file16722line61>
bq.  >
bq.  >     why not final?

It's back to being final now. In one of the earlier avatars of the patch, I was relying just on Object#finalize to release the connection, and wanted to set the connection to null in the hope that the GC would get to it quicker, but we don't need to do that anymore.


bq.  On 2011-04-21 00:36:11, Todd Lipcon wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java, lines 147-148
bq.  > <https://reviews.apache.org/r/643/diff/1/?file=16722#file16722line147>
bq.  >
bq.  >     assert !stopped, or even Preconditions.checkState(!stopped)

That might be a little harsher than is warranted. A safer approach would be to make the stop "destructor" method idempotent, which can be accomplished by not doing anything if the object is already "stopped", and that is what I did here for now. That's the way the destructors for most of the other objects (e.g., HTable) are implemented. Please let me know if we should do the assert anyway.
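
As a rough illustration of what an idempotent stop looks like (a sketch with a hypothetical {{SketchTracker}} class, not the actual CatalogTracker code), a second call simply returns without releasing anything again:

```java
/** Hypothetical tracker with an idempotent stop; not the real CatalogTracker. */
class SketchTracker {
  private boolean stopped = false;
  private int releases = 0; // counts how often the underlying resource was released

  /** Releases the resource on the first call; later calls are no-ops. */
  synchronized void stop() {
    if (stopped) {
      return; // already stopped, nothing to do
    }
    stopped = true;
    releases++; // stands in for releasing the connection
  }

  synchronized int releaseCount() {
    return releases;
  }
}
```

An assert-based variant would instead fail on the second call, which is the stricter behaviour the review comment suggests.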


bq.  On 2011-04-21 00:36:11, Todd Lipcon wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java, line 150
bq.  > <https://reviews.apache.org/r/643/diff/1/?file=16722#file16722line150>
bq.  >
bq.  >     I don't follow this. It seems like we're releasing a resource we didn't necessarily take ourselves. Spaghetti warning sign.

Ah, this is a good question. If you look at all of the references to the CatalogTracker, you'll notice that the Connection instance passed to it is never used outside the context of the CatalogTracker. In other words, the caller creates the Connection from the Configuration instance on behalf of the CatalogTracker. For the sake of consistency, I rewrote the CatalogTracker so that it takes a Configuration instead of a Connection instance. That way, we can rest assured that the CatalogTracker will only release resources that it itself takes.

The CatalogTracker was a big reason why I had to make so many changes to the HConnectionManager. After rewriting the former, the change to the latter is now relatively minimal.


bq.  On 2011-04-21 00:36:11, Todd Lipcon wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java, line 66
bq.  > <https://reviews.apache.org/r/643/diff/1/?file=16723#file16723line66>
bq.  >
bq.  >     why not final anymore?

I made it final again.


bq.  On 2011-04-21 00:36:11, Todd Lipcon wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 176
bq.  > <https://reviews.apache.org/r/643/diff/1/?file=16725#file16725line176>
bq.  >
bq.  >     this construction can go outside synch block

Agreed. Also, I rewrote the two delete connection methods so that they share their common logic.


bq.  On 2011-04-21 00:36:11, Todd Lipcon wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, lines 1633-1638
bq.  > <https://reviews.apache.org/r/643/diff/1/?file=16725#file16725line1633>
bq.  >
bq.  >     this should be renamed to decrementAndGetRefCount() to be clear that it returns the post-decrement value.
bq.  >     
bq.  >     Similar for incCount above.

The reference count methods were plagiarised from the HBaseClient class. Not sure why I added the return statement in the increment/decrement methods, but now they're gone.


bq.  On 2011-04-21 00:36:11, Todd Lipcon wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 1645
bq.  > <https://reviews.apache.org/r/643/diff/1/?file=16725#file16725line1645>
bq.  >
bq.  >     when is this method ever safe to use? I think it can be removed

Given the change above, this is the only way now for us to tell if a connection can be released.


bq.  On 2011-04-21 00:36:11, Todd Lipcon wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 1652
bq.  > <https://reviews.apache.org/r/643/diff/1/?file=16725#file16725line1652>
bq.  >
bq.  >     this logic seems wrong, because the finalizer will get called even if the thing is already closed, then ref count will get decremented below 0.

Actually, the finalize method does not check the reference count, nor does it need to, before it tries to close the connection. The assumption I made before was that the close(stopProxy) method was idempotent, but that wasn't completely true (it is possible that we might try to stop the region servers twice). To address that, I now check whether the connection is already closed before trying to release it, as Ted suggests in his comment below.


bq.  On 2011-04-21 00:36:11, Todd Lipcon wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 1315
bq.  > <https://reviews.apache.org/r/643/diff/1/?file=16726#file16726line1315>
bq.  >
bq.  >     all of these finally blocks should instead use something like IOUtils.cleanUp -- and HConnection should implement Closeable. This way if there's some exception, it doesn't mask a prior exception inside the actual try {...}

Good point. I replaced connection.close() with HConnectionManager.deleteConnection (which is kind of like the IOUtils.cleanUp method you wanted). As a result, we don't need HConnection to implement Closeable anymore.
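
For reference, the concern behind the review comment can be sketched as follows. This is only an illustration of the "quiet cleanup" idea, using a hypothetical {{SketchCleanup}} helper rather than Hadoop's IOUtils or the code in the patch: failures during cleanup are reported and swallowed so that they cannot mask an exception thrown earlier inside the try block.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical helper illustrating quiet cleanup; not Hadoop's IOUtils. */
class SketchCleanup {
  /** Closes each non-null resource, reporting but not propagating close failures. */
  static void closeQuietly(Closeable... resources) {
    for (Closeable resource : resources) {
      if (resource == null) {
        continue;
      }
      try {
        resource.close();
      } catch (IOException e) {
        // Swallowed on purpose: a failure here must not hide an earlier exception.
        System.err.println("Ignoring exception while closing: " + e);
      }
    }
  }
}
```

Called from a finally block, a helper like this lets the original exception from the try body reach the caller even when the cleanup itself fails.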


- Karthick


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review512
-----------------------------------------------------------


On 2011-04-20 23:56:13, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/643/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-20 23:56:13)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
bq.  
bq.  Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".
bq.  
bq.  
bq.  This addresses bug HBASE-3777.
bq.      https://issues.apache.org/jira/browse/HBASE-3777
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnection.java 2bb4725 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
bq.    src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
bq.    src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
bq.    src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 
bq.  
bq.  Diff: https://reviews.apache.org/r/643/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Karthick
bq.  
bq.



> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777.patch
>
>

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116575#comment-13116575 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

Test suite didn't go very far - TestLogRolling hangs
{code}
"main" prio=10 tid=0x0000000057197000 nid=0x3f12 waiting on condition [0x00000000406c8000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.hbase.client.HBaseAdmin.deleteTable(HBaseAdmin.java:446)
        at org.apache.hadoop.hbase.client.HBaseAdmin.deleteTable(HBaseAdmin.java:393)
        at org.apache.hadoop.hbase.regionserver.wal.TestLogRolling.testLogRollOnPipelineRestart(TestLogRolling.java:415)
{code}
                
> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: 3777-TOF.patch, HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777-V8.0.90.4.backport.patch, HBASE-3777.patch
>
>

        

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019619#comment-13019619 ] 

Karthick Sankarachary commented on HBASE-3777:
----------------------------------------------

I see. In that case, using a combination of {{conf.get("hbase.zookeeper.quorum")}} and {{conf.get("hbase.client.uniqueid")}} as the key, like Ted suggested, may be the way to go.
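
A minimal sketch of such a key, assuming the configuration can be read as a simple string map (the {{SketchConnectionKey}} class is hypothetical, and only the two properties named above are compared):

```java
import java.util.Arrays;
import java.util.Map;

/** Hypothetical cache key built from a fixed set of connection-relevant properties. */
class SketchConnectionKey {
  static final String[] KEY_PROPERTIES = {
    "hbase.zookeeper.quorum",
    "hbase.client.uniqueid"
  };

  private final String[] values;

  /** Snapshots the key properties; any other property in conf is ignored. */
  SketchConnectionKey(Map<String, String> conf) {
    values = new String[KEY_PROPERTIES.length];
    for (int i = 0; i < KEY_PROPERTIES.length; i++) {
      values[i] = conf.get(KEY_PROPERTIES[i]); // may be null if unset
    }
  }

  @Override
  public boolean equals(Object other) {
    if (this == other) {
      return true;
    }
    if (!(other instanceof SketchConnectionKey)) {
      return false;
    }
    return Arrays.equals(values, ((SketchConnectionKey) other).values);
  }

  @Override
  public int hashCode() {
    return Arrays.hashCode(values);
  }
}
```

With a key like this, a configuration and its clone map to the same cache entry, while setting a different unique id forces a separate connection.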

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777.patch
>
>

[jira] [Updated] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3777:
-------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

Committed to TRUNK after trying w/ 500 ycsb clients (it comes up and runs, whereas pre-patch it fails).

Thank you for your persistence Karthick (and to the reviewers).

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: 3777-TOF.patch, HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777.patch
>
>

[jira] [Updated] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Bright Fulton (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bright Fulton updated HBASE-3777:
---------------------------------

    Attachment: HBASE-3777-V8.0.90.4.backport.patch

Attached backport of fix to 0.90.4.

                
> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: 3777-TOF.patch, HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777-V8.0.90.4.backport.patch, HBASE-3777.patch
>
>

        

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020196#comment-13020196 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

Latest version of patch looks good.
Next we need to pass all unit tests.

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777.patch
>
>

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025945#comment-13025945 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review586
-----------------------------------------------------------


Can you include TableOutputFormat.java in the patch, please?

- Ted


On 2011-04-27 18:33:01, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/643/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-27 18:33:01)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
bq.  
bq.  Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".
bq.  
bq.  
bq.  This addresses bug HBASE-3777.
bq.      https://issues.apache.org/jira/browse/HBASE-3777
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
bq.    src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
bq.    src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java 79a48ba 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11 
bq.    src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 
bq.    src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 
bq.    src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 
bq.  
bq.  Diff: https://reviews.apache.org/r/643/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Karthick
bq.  
bq.



> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: 3777-TOF.patch, HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022627#comment-13022627 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

TestHBaseTestingUtility.multiClusters() should be improved with a catch block.
Previously we saw a timeout exception when the actual cause was a TableExistsException.
I propose adding the following before the finally block:
{code}
    } catch (Exception e) {
      LOG.error("multiClusters failed: ", e);
    }
{code}

BTW, TestHBaseTestingUtility and TestReplication both passed with the above addition of HConstants.ZOOKEEPER_ZNODE_PARENT

We should document HConnectionKey.CONNECTION_PROPERTIES so that developers know where to add new uniquifiers.
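
For readers following along, here is a minimal sketch of the idea behind such a key: only a fixed list of properties takes part in equals() and hashCode(), so a cloned configuration maps to the same key while a different znode parent does not. This is an illustration only, not the code from the patch. The class name ConnectionKeySketch, the use of a plain Map in place of a Configuration, and the particular property list are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a connection key; not the class from the patch. */
final class ConnectionKeySketch {
    // Assumed list of "uniquifiers": properties that identify a cluster connection.
    // A new uniquifier would be added here.
    static final String[] CONNECTION_PROPERTIES = {
        "hbase.zookeeper.quorum",
        "zookeeper.znode.parent",
        "hbase.zookeeper.property.clientPort"
    };

    // Only the listed properties are copied, so unrelated settings never affect identity.
    private final Map<String, String> properties = new TreeMap<>();

    ConnectionKeySketch(Map<String, String> conf) {
        for (String name : CONNECTION_PROPERTIES) {
            String value = conf.get(name);
            if (value != null) {
                properties.put(name, value);
            }
        }
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ConnectionKeySketch
            && properties.equals(((ConnectionKeySketch) other).properties);
    }

    @Override
    public int hashCode() {
        return properties.hashCode();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("hbase.zookeeper.quorum", "zk1,zk2");
        conf.put("zookeeper.znode.parent", "/hbase");

        // A copy with an extra, unrelated setting still yields an equal key.
        Map<String, String> copy = new HashMap<>(conf);
        copy.put("unrelated.setting", "x");

        // Same quorum but a different znode parent is treated as a different cluster.
        Map<String, String> other = new HashMap<>(conf);
        other.put("zookeeper.znode.parent", "/hbase-2");

        ConnectionKeySketch key = new ConnectionKeySketch(conf);
        System.out.println(key.equals(new ConnectionKeySketch(copy)));  // prints true
        System.out.println(key.equals(new ConnectionKeySketch(other))); // prints false
    }
}
```

In this sketch, leaving "zookeeper.znode.parent" out of the list would make two clusters that share a quorum collapse into one key, which is the kind of omission the tests above caught.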

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019592#comment-13019592 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

J-D informed me that my initial proposal mirrors what used to be done in 0.89.
The current design exists to bypass certain issues encountered in 0.89.

Shall we do the following?
Step 1, agree upon a mechanism for determining the identity of HBaseConfiguration instances and for reference counting. Enumerate the possible errors based on experience from 0.89 development.
Step 2, implement the new mechanism in trunk.
Step 3, thoroughly test (YCSB, etc) before publishing.
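
As a starting point for the Step 1 discussion, the reference-counting half could look roughly like the sketch below: callers that present the same identity key share one instance, and the instance is closed only when the last holder releases it. This is an assumption-laden illustration, not the trunk design. RefCountedRegistrySketch, Shared, acquire and release are hypothetical names, and a String stands in for whatever identity the configuration ends up having.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry illustrating reference-counted sharing; not HBase code. */
final class RefCountedRegistrySketch {
    /** Stand-in for a shared connection. */
    static final class Shared {
        final String key;
        int refCount;
        boolean closed;

        Shared(String key) {
            this.key = key;
        }
    }

    private final Map<String, Shared> live = new HashMap<>();

    /** Returns the instance for this key, creating it on first use, and counts the caller. */
    synchronized Shared acquire(String key) {
        Shared shared = live.get(key);
        if (shared == null) {
            shared = new Shared(key);
            live.put(key, shared);
        }
        shared.refCount++;
        return shared;
    }

    /** Drops one reference; the instance is closed when the count reaches zero. */
    synchronized void release(Shared shared) {
        if (shared.refCount <= 0) {
            throw new IllegalStateException("released too many times: " + shared.key);
        }
        shared.refCount--;
        if (shared.refCount == 0) {
            // Remove only if this exact instance is still the registered one.
            live.remove(shared.key, shared);
            shared.closed = true;
        }
    }

    synchronized int size() {
        return live.size();
    }

    public static void main(String[] args) {
        RefCountedRegistrySketch registry = new RefCountedRegistrySketch();
        Shared first = registry.acquire("cluster-1");
        Shared second = registry.acquire("cluster-1");
        System.out.println(first == second); // prints true

        registry.release(first);
        System.out.println(first.closed);    // prints false, one holder remains

        registry.release(second);
        System.out.println(first.closed);    // prints true
    }
}
```

One error case worth enumerating is the over-release guarded against above: a client that releases twice would otherwise close a connection that another client is still using.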

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026001#comment-13026001 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------



bq.  On 2011-04-27 01:33:01, Jean-Daniel Cryans wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/master/HMaster.java, line 360
bq.  > <https://reviews.apache.org/r/643/diff/3/?file=16917#file16917line360>
bq.  >
bq.  >     Same comment as in HRS, I think this is creating a second connection for the master.

Same comment as in HRS. Again, please correct me if I'm wrong.


- Karthick


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review569
-----------------------------------------------------------






[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026056#comment-13026056 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/
-----------------------------------------------------------

(Updated 2011-04-28 00:13:52.902673)


Review request for hbase and Ted Yu.


Changes
-------

The V7 version of the patch makes the following additional changes:

a) Adds a HCM#execute method for executing blocks that require short-lived connections.
b) Removes the HCM#deleteConnection from the HMaster and HRegionServer classes, as they no longer directly get connections.
c) Adds a connection field in the ServerManager class, which is gotten in its constructor and deleted when it's stopped.
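
To make change (a) easier to review, here is a rough sketch of the execute-around shape such a method might take: the helper obtains a connection, runs the caller's block, and always releases the connection afterwards, even if the block throws. This is not the patch itself. ExecuteAroundSketch and FakeConnection are hypothetical names, the connection is created directly rather than looked up through the connection manager, and the lambda syntax is used only for brevity.

```java
import java.util.function.Function;

/** Hypothetical illustration of an execute-around helper; not the HCM code. */
final class ExecuteAroundSketch {
    /** Stand-in for a short-lived connection. */
    static final class FakeConnection implements AutoCloseable {
        boolean closed;

        String lookup(String table) {
            if (closed) {
                throw new IllegalStateException("connection closed");
            }
            return "location-of-" + table;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    // Kept only so the demonstration can inspect the connection afterwards.
    static FakeConnection lastConnection;

    /** Runs the block against a fresh connection and closes it in all cases. */
    static <T> T execute(Function<FakeConnection, T> block) {
        FakeConnection connection = new FakeConnection();
        lastConnection = connection;
        try {
            return block.apply(connection);
        } finally {
            connection.close();
        }
    }

    public static void main(String[] args) {
        String location = execute(c -> c.lookup("t1"));
        System.out.println(location);              // prints location-of-t1
        System.out.println(lastConnection.closed); // prints true
    }
}
```

The point of this shape is that callers needing a connection only briefly never hold a reference past the block, so they cannot forget the matching delete.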

All but two tests (viz., TestSplitLogWorker and TestCatalogJanitor) passed. FWIW, those two failures happen without the patch as well, and only if "hbase.master.distributed.log.splitting" is true.

PS: Just heard from Ted Yu that Stack checked in a patch for HBASE-1502, which I'll rebase onto and test tomorrow. In the meantime, please review the three critical changes described above.


Summary
-------

Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?

Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".


This addresses bug HBASE-3777.
    https://issues.apache.org/jira/browse/HBASE-3777


Diffs (updated)
-----

  src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
  src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
  src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
  src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 70affa0 
  src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
  src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
  src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
  src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
  src/main/java/org/apache/hadoop/hbase/mapred/TableOutputFormat.java 80284bb 
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java 05d9395 
  src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java 250a8cf 
  src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 04befe9 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11 
  src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
  src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e 
  src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3 
  src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 
  src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java 915cdf6 
  src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 

Diff: https://reviews.apache.org/r/643/diff


Testing
-------

mvn test


Thanks,

Karthick




[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022662#comment-13022662 ] 

Karthick Sankarachary commented on HBASE-3777:
----------------------------------------------

Ted,

{quote}BTW, TestHBaseTestingUtility and TestReplication both passed with the above addition of HConstants.ZOOKEEPER_ZNODE_PARENT{quote}

Excellent catch. Added the missing property to the connection key.


[jira] [Updated] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthick Sankarachary updated HBASE-3777:
-----------------------------------------

    Attachment: HBASE-3777.patch

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777.patch, HBASE-3777.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025948#comment-13025948 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------



bq.  On 2011-04-27 18:43:04, Ted Yu wrote:
bq.  > Can you include TableOutputFormat.java in the patch please ?

Will do.


- Karthick


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review586
-----------------------------------------------------------





> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: 3777-TOF.patch, HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777.patch
>
>
> Judging from the javadoc in {{HConnectionManager}}, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create {{HTable}} instances using a given {{Configuration}} instance and a copy thereof, we end up with two distinct {{HConnection}} instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
> Here, I'd like to play devil's advocate and propose that we "deep-compare" {{HBaseConfiguration}} instances, so that multiple {{HBaseConfiguration}} instances that have the same properties map to the same {{HConnection}} instance. In case one is "concerned that a single {{HConnection}} is insufficient for sharing amongst clients",  to quote the javadoc, then one should be able to mark a given {{HBaseConfiguration}} instance as being "uniquely identifiable".
> Note that "sharing connections makes clean up of {{HConnection}} instances a little awkward", unless of course, you apply the change described in HBASE-3766.


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13024938#comment-13024938 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review543
-----------------------------------------------------------


Good work lads (Karthick and Ted reviewing).  Small nitpicks below.  Let's get this in if all tests pass.


src/main/java/org/apache/hadoop/hbase/HConstants.java
<https://reviews.apache.org/r/643/#comment1136>

    Copy/paste issue (minor)



src/main/java/org/apache/hadoop/hbase/HConstants.java
<https://reviews.apache.org/r/643/#comment1137>

    Thanks for moving these configs in here.



src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
<https://reviews.apache.org/r/643/#comment1141>

    This looks like a good change.



src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
<https://reviews.apache.org/r/643/#comment1143>

    Implement Closeable now you've added close?



src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
<https://reviews.apache.org/r/643/#comment1142>

    Good



src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<https://reviews.apache.org/r/643/#comment1146>

    This is painful, but makes sense.



src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<https://reviews.apache.org/r/643/#comment1147>

    Not important, but if the connection is already closed you could return immediately and save indenting the whole method.  Just a style nit.



src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<https://reviews.apache.org/r/643/#comment1148>

    So, this is just insurance, as you say in the issue.  That's fine, I'd say (I agree w/ Ted that we shouldn't rely on finalize).



src/main/java/org/apache/hadoop/hbase/client/HTable.java
<https://reviews.apache.org/r/643/#comment1149>

    Good.



src/main/java/org/apache/hadoop/hbase/client/HTable.java
<https://reviews.apache.org/r/643/#comment1150>

    Yeah, this is ugly... it's almost as though you should have a special method for it, one that does not up the counters?



src/main/java/org/apache/hadoop/hbase/client/HTable.java
<https://reviews.apache.org/r/643/#comment1151>

    Just remove this.



src/main/java/org/apache/hadoop/hbase/client/HTablePool.java
<https://reviews.apache.org/r/643/#comment1152>

    Just remove.



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
<https://reviews.apache.org/r/643/#comment1153>

    Interesting, but I'll go along with it.  Looks like we only made this connection for CatalogTracker (CT)?  If so, that was bad design, fixed by your CT change.



src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
<https://reviews.apache.org/r/643/#comment1154>

    ditto


- Michael
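
Two of the suggestions above (implementing Closeable, and returning early when already closed) can be illustrated together. This is only a sketch under assumptions: AdminLikeClient is a hypothetical stand-in, not the real HBaseAdmin or HConnectionManager code, and the counter merely marks where a real client would release its connection. The try-with-resources form needs Java 7 or later.

```java
import java.io.Closeable;

// Hypothetical stand-in for an admin-style client; not the real HBaseAdmin.
final class AdminLikeClient implements Closeable {
    private boolean closed = false;
    private int releaseCount = 0; // how many times resources were actually released

    // Closeable.close() may be overridden without the IOException clause.
    @Override
    public void close() {
        // Guard clause: return at once if already closed, so the rest of the
        // method does not need to be nested inside an if-block.
        if (closed) {
            return;
        }
        closed = true;
        releaseCount++; // placeholder for releasing the underlying connection
    }

    boolean isClosed() {
        return closed;
    }

    int getReleaseCount() {
        return releaseCount;
    }

    public static void main(String[] args) {
        AdminLikeClient client = new AdminLikeClient();
        // Because the class implements Closeable, it can be managed by try-with-resources.
        try (AdminLikeClient managed = client) {
            System.out.println("closed inside try: " + managed.isClosed());
        }
        client.close(); // second call hits the guard clause and does nothing
        System.out.println("release count: " + client.getReleaseCount());
    }
}
```

The guard clause also makes close() idempotent, which matters when several owners may each call it.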


On 2011-04-22 21:16:59, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/643/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-22 21:16:59)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
bq.  
bq.  Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".
bq.  
bq.  
bq.  This addresses bug HBASE-3777.
bq.      https://issues.apache.org/jira/browse/HBASE-3777
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
bq.    src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
bq.    src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java 79a48ba 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11 
bq.    src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 
bq.    src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 
bq.    src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 
bq.  
bq.  Diff: https://reviews.apache.org/r/643/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Karthick
bq.  
bq.




[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020025#comment-13020025 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

I think the following check in equals() can be relaxed a little:
{code}
+          if (thisValue == null || thatValue == null
+              || !thisValue.equals(thatValue)) {
{code}
the clause (thatValue == null) can be omitted.
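
To see why the second null test is redundant: String.equals(null) is specified to return false, so once thisValue is known to be non-null, a null thatValue already makes the equals() call fail. The sketch below compares the two forms on a few sample values. ValueCheck and its method names are hypothetical and are not taken from the patch.

```java
// Hypothetical helper, not from the HBASE-3777 patch.
final class ValueCheck {
    private ValueCheck() {}

    /** Original form of the check: true when the two values count as different. */
    static boolean differsOriginal(String thisValue, String thatValue) {
        return thisValue == null || thatValue == null || !thisValue.equals(thatValue);
    }

    /** Relaxed form: equals(null) already returns false, so the thatValue null test is dropped. */
    static boolean differsRelaxed(String thisValue, String thatValue) {
        return thisValue == null || !thisValue.equals(thatValue);
    }

    public static void main(String[] args) {
        String[] samples = {null, "a", "b"};
        // Exhaustively compare both forms over every pair of sample values.
        for (String x : samples) {
            for (String y : samples) {
                if (differsOriginal(x, y) != differsRelaxed(x, y)) {
                    throw new AssertionError("forms disagree for " + x + ", " + y);
                }
            }
        }
        System.out.println("both forms agree on all sample pairs");
    }
}
```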

I agree that reference counting can be handled in HBASE-3766.

In the future, please use https://reviews.apache.org/ for review.

Running the new test, I got:
{code}
-------------------------------------------------------------------------------
Test set: org.apache.hadoop.hbase.client.TestConnectionManager
-------------------------------------------------------------------------------
Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 28.505 sec <<< FAILURE!
testConnectionSameness(org.apache.hadoop.hbase.client.TestConnectionManager)  Time elapsed: 16.789 sec  <<< ERROR!
org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able to connect to ZooKeeper but the connection closes immediately. This could be a sign that the server has too many connections (30 is the default). Consider inspecting your ZK server logs for that error and then make sure you are reusing HBaseConfiguration as often as you can. See HTable's javadoc for more information.
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:159)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:1076)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:372)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:363)
        at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:156)
        at org.apache.hadoop.hbase.client.TestConnectionManager.testConnectionSameness(TestConnectionManager.java:76)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
...
Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:809)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:837)
        at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:902)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:137)
        ... 31 more
{code}

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777.patch, HBASE-3777.patch
>
>
> Judging from the javadoc in {{HConnectionManager}}, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create {{HTable}} instances using a given {{Configuration}} instance and a copy thereof, we end up with two distinct {{HConnection}} instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
> Here, I'd like to play devil's advocate and propose that we "deep-compare" {{HBaseConfiguration}} instances, so that multiple {{HBaseConfiguration}} instances that have the same properties map to the same {{HConnection}} instance. In case one is "concerned that a single {{HConnection}} is insufficient for sharing amongst clients",  to quote the javadoc, then one should be able to mark a given {{HBaseConfiguration}} instance as being "uniquely identifiable".
> Note that "sharing connections makes clean up of {{HConnection}} instances a little awkward", unless of course, you apply the change described in HBASE-3766.


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019570#comment-13019570 ] 

Karthick Sankarachary commented on HBASE-3777:
----------------------------------------------

Ted, 

I saw your comment on HBASE-3734. It:

a) Proposes a neater way of comparing {{Configuration}} instances, for the purposes of {{HConnection}} lookup. In fact, the thought of comparing just the cluster-specific properties in {{HBaseConfiguration}} did cross my mind. However, at times, you may want the ability to have multiple connections per cluster, which would not be possible using your approach. 

b) Validates the need for having a reference count on the connection. Instead of using a (refcount, connection) tuple as the value in HBASE_INSTANCES though, HBASE-3766 puts the refcount in the connection itself. Do you see a specific advantage to separating out the refcount from the connection?

Regards,
Karthick
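
The two points above (looking up connections by property values rather than by object identity, and keeping the reference count inside the connection) can be sketched as follows. Everything here is an assumption made for illustration: a plain Map stands in for Configuration, ConnectionKey and CountedConnection are hypothetical types rather than HBase classes, and "example.instance.id" is an invented property showing how a caller could still force a separate connection to the same cluster.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these types are the real HBase classes.
final class ConnectionCacheSketch {

    /** Lookup key that compares a chosen subset of properties by value, not by identity. */
    static final class ConnectionKey {
        private final Map<String, String> relevant = new HashMap<>();

        ConnectionKey(Map<String, String> conf, String... keys) {
            for (String k : keys) {
                relevant.put(k, conf.get(k)); // missing properties are stored as null
            }
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof ConnectionKey && relevant.equals(((ConnectionKey) o).relevant);
        }

        @Override
        public int hashCode() {
            return relevant.hashCode();
        }
    }

    /** Connection stand-in that carries its own reference count. */
    static final class CountedConnection {
        private int refCount = 0;
        private boolean closed = false;

        void retain() {
            refCount++;
        }

        /** Drops one reference; returns true once the connection is closed. */
        boolean release() {
            if (refCount > 0 && --refCount == 0) {
                closed = true;
            }
            return closed;
        }

        boolean isClosed() {
            return closed;
        }
    }

    private final Map<ConnectionKey, CountedConnection> instances = new HashMap<>();
    private final String[] keys;

    ConnectionCacheSketch(String... keys) {
        this.keys = keys;
    }

    synchronized CountedConnection getConnection(Map<String, String> conf) {
        ConnectionKey key = new ConnectionKey(conf, keys);
        CountedConnection connection = instances.get(key);
        if (connection == null) {
            connection = new CountedConnection();
            instances.put(key, connection);
        }
        connection.retain();
        return connection;
    }

    synchronized void releaseConnection(Map<String, String> conf) {
        ConnectionKey key = new ConnectionKey(conf, keys);
        CountedConnection connection = instances.get(key);
        if (connection != null && connection.release()) {
            instances.remove(key); // last reference gone, forget the closed connection
        }
    }

    public static void main(String[] args) {
        ConnectionCacheSketch cache =
            new ConnectionCacheSketch("hbase.zookeeper.quorum", "example.instance.id");
        Map<String, String> conf = new HashMap<>();
        conf.put("hbase.zookeeper.quorum", "zk1");
        Map<String, String> copy = new HashMap<>(conf); // a cloned configuration

        System.out.println("copy shares connection: "
            + (cache.getConnection(conf) == cache.getConnection(copy)));

        copy.put("example.instance.id", "2"); // opt out of sharing
        System.out.println("tagged copy shares connection: "
            + (cache.getConnection(conf) == cache.getConnection(copy)));
    }
}
```

With the count stored in the connection, the cache map keeps a plain connection as its value. A (refcount, connection) tuple would instead keep the counting logic in the manager. Which is preferable is the open question in the comment above.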



> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777.patch
>
>
> Judging from the javadoc in {{HConnectionManager}}, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create {{HTable}} instances using a given {{Configuration}} instance and a copy thereof, we end up with two distinct {{HConnection}} instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
> Here, I'd like to play devil's advocate and propose that we "deep-compare" {{HBaseConfiguration}} instances, so that multiple {{HBaseConfiguration}} instances that have the same properties map to the same {{HConnection}} instance. In case one is "concerned that a single {{HConnection}} is insufficient for sharing amongst clients",  to quote the javadoc, then one should be able to mark a given {{HBaseConfiguration}} instance as being "uniquely identifiable".
> Note that "sharing connections makes clean up of {{HConnection}} instances a little awkward", unless of course, you apply the change described in HBASE-3766.


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027860#comment-13027860 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/
-----------------------------------------------------------

(Updated 2011-05-02 21:34:35.203784)


Review request for hbase and Ted Yu.


Changes
-------

As Ted suggested, added "a log statement for the case where connectSucceeded is false."


Summary
-------

Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?

Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".


This addresses bug HBASE-3777.
    https://issues.apache.org/jira/browse/HBASE-3777


Diffs (updated)
-----

  src/main/java/org/apache/hadoop/hbase/HConstants.java 0911375 
  src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java feed777 
  src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java fa7448b 
  src/main/java/org/apache/hadoop/hbase/client/HConnection.java 1beedaf 
  src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java ded51c8 
  src/main/java/org/apache/hadoop/hbase/client/HTable.java 46bac9f 
  src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
  src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 26d0b31 
  src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
  src/main/java/org/apache/hadoop/hbase/mapred/TableOutputFormat.java 80284bb 
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java 05d9395 
  src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java f526411 
  src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 834c456 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 421f275 
  src/main/java/org/apache/hadoop/hbase/replication/master/ReplicationLogCleaner.java 133da33 
  src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java fc71f03 
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 1c1d94b 
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java 39f3af2 
  src/main/java/org/apache/hadoop/hbase/util/HMerge.java c447287 
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java a4def94 
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
  src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java 75613b8 
  src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java ae333bb 
  src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 18e647e 
  src/test/java/org/apache/hadoop/hbase/client/TestHCM.java 5d71d75 
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 
  src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java 752e12b 
  src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 

Diff: https://reviews.apache.org/r/643/diff


Testing
-------

mvn test


Thanks,

Karthick




[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025543#comment-13025543 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review569
-----------------------------------------------------------



src/main/java/org/apache/hadoop/hbase/master/HMaster.java
<https://reviews.apache.org/r/643/#comment1213>

    Same comment as in HRegionServer: I think this is creating a second connection for the master.



src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
<https://reviews.apache.org/r/643/#comment1212>

    IIUC, we are creating an additional connection here, since CatalogTracker will do a getConnection with the passed conf instead of using the connection that the region server already has.


- Jean-Daniel






[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022773#comment-13022773 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

Got a new test failure:
{code}
Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 194.698 sec <<< FAILURE!
testMultiRegionTable(org.apache.hadoop.hbase.mapreduce.TestTableMapReduce)  Time elapsed: 176.014 sec  <<< ERROR!
java.lang.RuntimeException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server null for region , row 'hhh', but failed after 10 attempts.
Exceptions:
java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@6c5ed253 closed
java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@6c5ed253 closed
java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@6c5ed253 closed
java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@6c5ed253 closed
java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@6c5ed253 closed
java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@6c5ed253 closed
java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@6c5ed253 closed
java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@6c5ed253 closed
java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@6c5ed253 closed
java.io.IOException: org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@6c5ed253 closed
        at org.apache.hadoop.hbase.client.HTable$ClientScanner$1.hasNext(HTable.java:1228)
        at org.apache.hadoop.hbase.mapreduce.TestTableMapReduce.verifyAttempt(TestTableMapReduce.java:189)
        at org.apache.hadoop.hbase.mapreduce.TestTableMapReduce.verify(TestTableMapReduce.java:158)
        at org.apache.hadoop.hbase.mapreduce.TestTableMapReduce.runTestOnTable(TestTableMapReduce.java:140)
        at org.apache.hadoop.hbase.mapreduce.TestTableMapReduce.testMultiRegionTable(TestTableMapReduce.java:114)
{code}


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021942#comment-13021942 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

Should MAX_CACHED_HBASE_INSTANCES be increased ?
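
For context on why the bound matters, here is a rough sketch of a size-limited connection cache. It assumes the cache behaves like an LRU map that closes whatever it evicts; the class names below are made up for illustration and are not the HConnectionManager internals. With a small bound, a client that still holds an evicted connection would find it closed.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a cached connection; records whether it was closed.
final class FakeConnection implements Closeable {
  boolean closed;

  @Override
  public void close() {
    closed = true;
  }
}

// Hypothetical bounded cache: once the bound is exceeded, the least recently
// used entry is closed and dropped.
final class ConnectionCacheSketch<K, V extends Closeable> extends LinkedHashMap<K, V> {
  private static final long serialVersionUID = 1L;
  private final int maxEntries;

  ConnectionCacheSketch(int maxEntries) {
    super(16, 0.75f, true); // true = access order, which gives LRU behavior
    this.maxEntries = maxEntries;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    if (size() <= maxEntries) {
      return false;
    }
    try {
      eldest.getValue().close(); // close before the entry is removed
    } catch (IOException e) {
      // Sketch only: a real cache would log this.
    }
    return true;
  }
}
```

Raising the bound delays eviction but does not remove the hazard; anyone holding a reference past eviction still sees a closed connection.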


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020906#comment-13020906 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

Please consider running the tests on a Linux box, such as:
Linux x-grid07.ciq.com 2.6.18-194.8.1.el5 #1 SMP Thu Jul 1 19:04:48 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

None of the tests above failed on the above machine. This was the only credible failure I saw:
{code}
Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 229.043 sec <<< FAILURE!
testMasterFailoverWithMockedRIT(org.apache.hadoop.hbase.master.TestMasterFailover)  Time elapsed: 180.039 sec  <<< ERROR!
java.lang.Exception: test timed out after 180000 milliseconds
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.hbase.MiniHBaseCluster.waitForActiveAndReadyMaster(MiniHBaseCluster.java:478)
        at org.apache.hadoop.hbase.master.TestMasterFailover.testMasterFailoverWithMockedRIT(TestMasterFailover.java:429)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
{code}


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030228#comment-13030228 ] 

Hudson commented on HBASE-3777:
-------------------------------

Integrated in HBase-TRUNK #1909 (See [https://builds.apache.org/hudson/job/HBase-TRUNK/1909/])
    


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020897#comment-13020897 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

See Effective Java, 2nd edition, page 31, on the use of finalizers.


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020118#comment-13020118 ] 

Karthick Sankarachary commented on HBASE-3777:
----------------------------------------------

I changed the assertion in {{TestHCM#testManyNewConnectionsDoesnotOOME}}, moved my test cases (viz., {{testConnectionSameness}} and {{testConnectionUniqueness}}) there, and verified that they pass.

As for making {{HConnectionKey#conf}} immutable: the user (client) still holds a reference to the mutable {{Configuration}} instance, so there is really no way to enforce its immutability, short of marking the connection properties as "final" in hbase-default.xml (and even then, the user can choose not to make them final). In any case, it's not the end of the world if a connection property changes after the fact: the initial entry for that {{Configuration}} will eventually get evicted, but not before its {{HConnection}} is deleted, so we should be okay.
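
To make the "deep-compare" idea and the mutability concern concrete, here is a minimal sketch of a key that snapshots a few connection-related properties at construction time. It is not the patch's {{HConnectionKey}}: the class name is hypothetical, a plain Map<String, String> stands in for {{Configuration}}, and the list of properties is an assumed subset chosen for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a connection key; Map<String, String> replaces
// Hadoop's Configuration so the sketch has no external dependencies.
final class ConnectionKeySketch {
  // Assumed subset of connection-relevant properties, for illustration only.
  static final List<String> CONNECTION_PROPERTIES = Arrays.asList(
      "hbase.zookeeper.quorum",
      "zookeeper.znode.parent",
      "hbase.client.retries.number");

  private final Map<String, String> snapshot = new HashMap<String, String>();

  ConnectionKeySketch(Map<String, String> conf) {
    // Copy the values now; later changes to conf do not affect this key.
    for (String name : CONNECTION_PROPERTIES) {
      String value = conf.get(name);
      if (value != null) {
        snapshot.put(name, value);
      }
    }
  }

  @Override
  public boolean equals(Object other) {
    return other instanceof ConnectionKeySketch
        && snapshot.equals(((ConnectionKeySketch) other).snapshot);
  }

  @Override
  public int hashCode() {
    return snapshot.hashCode();
  }
}
```

With a key like this, a configuration and its clone map to the same cache entry, and changing a connection property afterwards simply yields a different key on the next lookup while the old entry waits to be evicted.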




[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020851#comment-13020851 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

I am running tests based on HBASE-3777-V2.patch
Can you disclose which five tests failed ?

Good job.


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020905#comment-13020905 ] 

Karthick Sankarachary commented on HBASE-3777:
----------------------------------------------

{quote}Can you disclose which five tests failed ?{quote}

The failures, mostly confined to the HDFS layer, include:

- org.apache.hadoop.hbase.master.TestHMasterRPCException#testRPCException(TestHMasterRPCException.java:57) - "100 millis timeout while waiting for channel to be ready for read"
- org.apache.hadoop.hbase.replication.TestReplication#setUp(TestReplication.java:174) - "Waited too much time for truncate"
- org.apache.hadoop.hbase.TestHBaseTestingUtility#testMiniZooKeeper(TestHBaseTestingUtility.java:142) - "Unable to create data directory"
- org.apache.hadoop.hbase.coprocessor.TestRegionObserverStacking#testRegionObserverStacking(TestRegionObserverStacking.java:112) - "Cannot get log writer"
- org.apache.hadoop.hbase.client.TestGetRowVersions#testGetRowMultipleVersions(TestGetRowVersions.java:67) - "CRC check failed"



[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13024967#comment-13024967 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------



bq.  On 2011-04-25 20:05:54, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/HConstants.java, line 116
bq.  > <https://reviews.apache.org/r/643/diff/3/?file=16908#file16908line116>
bq.  >
bq.  >     Copy/paste issue (minor)

Will change it to "Default limit on concurrent client-side zookeeper connections".


bq.  On 2011-04-25 20:05:54, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/HConstants.java, line 442
bq.  > <https://reviews.apache.org/r/643/diff/3/?file=16908#file16908line442>
bq.  >
bq.  >     Thanks for moving these configs. in here.

Yeah, the HConnectionKey would not have looked pretty if we hadn't moved those configs to HConstants.

I will remove the trailing space. 


bq.  On 2011-04-25 20:05:54, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java, line 177
bq.  > <https://reviews.apache.org/r/643/diff/3/?file=16909#file16909line177>
bq.  >
bq.  >     This looks like a good change.

As a matter of fact, the CatalogTracker was the only class that was being handed a connection, which made cleanup tricky since it didn't really own that connection (as Todd rightly pointed out). Making it take a configuration seemed like the most pragmatic thing to do.


bq.  On 2011-04-25 20:05:54, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java, line 63
bq.  > <https://reviews.apache.org/r/643/diff/3/?file=16910#file16910line63>
bq.  >
bq.  >     Implement Closeable now you've added close?

Yes, we can. I'll make HConnection implement Closeable as well. If you want, we can make HTablePool implement Closeable by calling closeTablePool on all of its tables.


bq.  On 2011-04-25 20:05:54, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 265
bq.  > <https://reviews.apache.org/r/643/diff/3/?file=16911#file16911line265>
bq.  >
bq.  >     This is painful, but makes sense.

A small price to pay, in my opinion.


bq.  On 2011-04-25 20:05:54, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 1207
bq.  > <https://reviews.apache.org/r/643/diff/3/?file=16911#file16911line1207>
bq.  >
bq.  >     Not important but if closed, just return immediately and then you can save indenting whole method.  Not important.  Just style diff.

Will do.


bq.  On 2011-04-25 20:05:54, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 1667
bq.  > <https://reviews.apache.org/r/643/diff/3/?file=16911#file16911line1667>
bq.  >
bq.  >     So, this is just insurance as you say in the issue.  Thats fine I'd say (I agree w/ Ted that we shouldn't rely on finalize)

Exactly - it's just insurance, a fall-back in case some thread somewhere was unable to close the connection for whatever reason.
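
Roughly, the pattern is an explicit, idempotent close() as the primary path, with finalize() only calling it as a last resort. The sketch below uses a hypothetical class, not HConnectionImplementation, and the counter exists only to show that the resource is released once.

```java
import java.io.Closeable;

// Hypothetical connection-like resource, for illustration only.
class GuardedConnectionSketch implements Closeable {
  private boolean closed;
  private int closeCount; // number of real releases; demonstration only

  @Override
  public synchronized void close() {
    if (closed) {
      return; // repeated calls are harmless
    }
    closed = true;
    closeCount++;
  }

  synchronized boolean isClosed() {
    return closed;
  }

  synchronized int closeCount() {
    return closeCount;
  }

  // Fallback only: the JVM gives no guarantee about when, or whether, this runs.
  @Override
  @SuppressWarnings({"deprecation", "removal"})
  protected void finalize() throws Throwable {
    try {
      close();
    } finally {
      super.finalize();
    }
  }
}
```

Callers are still expected to call close() themselves; the finalizer just narrows the window for leaks when they do not.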


bq.  On 2011-04-25 20:05:54, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/master/HMaster.java, line 355
bq.  > <https://reviews.apache.org/r/643/diff/3/?file=16917#file16917line355>
bq.  >
bq.  >     Interesting but I go along w/ it.  Looks like we only made this connection for CT?  If so, bad design fixed by your CT change.

Yes, for the most part, the connection that was being given to CT was not used for anything else. There was one exception though (TestCatalogTracker), which was doing all kinds of things on the connection outside of the CT, and to accommodate that, I left open a package-level constructor in CT that is visible only to that test case (it'd be too much trouble to change it).


bq.  On 2011-04-25 20:05:54, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HTablePool.java, line 150
bq.  > <https://reviews.apache.org/r/643/diff/3/?file=16913#file16913line150>
bq.  >
bq.  >     Just remove.

Ok, will remove all dead code.


bq.  On 2011-04-25 20:05:54, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 259
bq.  > <https://reviews.apache.org/r/643/diff/3/?file=16912#file16912line259>
bq.  >
bq.  >     Yeah, this is ugly.... its almost as though you should have a special method for it, one that does not up the counters?

Just a thought - how about if we hide the ugliness in HCM, like so:

  // Nested in HCM; static so callers can subclass it without an HCM instance.
  public static abstract class Connectable<T> {
    public Configuration conf;

    public Connectable(Configuration conf) {
      this.conf = conf;
    }

    // Runs against the shared connection obtained for conf.
    public abstract T connect(HConnection connection) throws IOException;
  }

  public static <T> T execute(Connectable<T> connectable) throws IOException {
    if (connectable == null || connectable.conf == null) {
      return null;
    }
    Configuration conf = connectable.conf;
    HConnection connection = HConnectionManager.getConnection(conf);
    try {
      return connectable.connect(connection);
    } finally {
      // Balance the getConnection call above, even if connect() throws.
      HConnectionManager.deleteConnection(conf, false);
    }
  }

That way, the HTable call would look somewhat prettier:

  HConnectionManager.execute(new Connectable<Boolean>(conf) {
    public Boolean connect(HConnection connection) throws IOException {
      return connection.isTableEnabled(tableName);
    }
  });
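
For anyone who wants to try the borrow/release pairing outside HBase, here is a self-contained sketch. It assumes connections are reference counted (my reading of "up the counters" above); every type in it is a hypothetical stand-in, with a String key in place of the configuration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry of shared, reference-counted connections.
final class RefCountedRegistrySketch {
  static final class Conn {
    int refCount;
    boolean closed;
  }

  interface Task<T> {
    T run(Conn connection);
  }

  private final Map<String, Conn> connections = new HashMap<String, Conn>();

  // Returns the shared connection for key, creating it on first use.
  synchronized Conn acquire(String key) {
    Conn c = connections.get(key);
    if (c == null) {
      c = new Conn();
      connections.put(key, c);
    }
    c.refCount++;
    return c;
  }

  // Drops one reference; the connection is closed when none remain.
  synchronized void release(String key) {
    Conn c = connections.get(key);
    if (c != null && --c.refCount <= 0) {
      c.closed = true;
      connections.remove(key);
    }
  }

  // Pairs acquire with release so callers cannot forget the cleanup.
  <T> T execute(String key, Task<T> task) {
    Conn c = acquire(key);
    try {
      return task.run(c);
    } finally {
      release(key);
    }
  }

  synchronized int openConnections() {
    return connections.size();
  }
}
```

The point of the wrapper is that the count goes back down on every path, including when the task throws, so a one-off call never keeps a shared connection alive by accident.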


- Karthick


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review543
-----------------------------------------------------------


On 2011-04-22 21:16:59, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/643/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-22 21:16:59)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
bq.  
bq.  Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".
bq.  
bq.  
bq.  This addresses bug HBASE-3777.
bq.      https://issues.apache.org/jira/browse/HBASE-3777
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
bq.    src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
bq.    src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java 79a48ba 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11 
bq.    src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 
bq.    src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 
bq.    src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 
bq.  
bq.  Diff: https://reviews.apache.org/r/643/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Karthick
bq.  
bq.



> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: 3777-TOF.patch, HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777.patch
>
>
> Judging from the javadoc in {{HConnectionManager}}, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create {{HTable}} instances using a given {{Configuration}} instance and a copy thereof, we end up with two distinct {{HConnection}} instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
> Here, I'd like to play devil's advocate and propose that we "deep-compare" {{HBaseConfiguration}} instances, so that multiple {{HBaseConfiguration}} instances that have the same properties map to the same {{HConnection}} instance. In case one is "concerned that a single {{HConnection}} is insufficient for sharing amongst clients",  to quote the javadoc, then one should be able to mark a given {{HBaseConfiguration}} instance as being "uniquely identifiable".
> Note that "sharing connections makes clean up of {{HConnection}} instances a little awkward", unless of course, you apply the change described in HBASE-3766.


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020006#comment-13020006 ] 

Karthick Sankarachary commented on HBASE-3777:
----------------------------------------------

About HBASE-2925, I'm not convinced that the memory leak was caused by the way {{HBaseConfiguration#equals}} was implemented. Just because the hash code of the {{Configuration}} instance changes when a new property is added doesn't mean that the entry will drop off the LRU chain. Judging by the {{LinkedHashMap#addEntry}} method shown below, if a key is not accessed frequently enough, it will get evicted no matter what.

{code}
    void addEntry(int hash, K key, V value, int bucketIndex) {
        createEntry(hash, key, value, bucketIndex);

        // Remove eldest entry if instructed, else grow capacity if appropriate
        Entry<K,V> eldest = header.after;
        if (removeEldestEntry(eldest)) {
            removeEntryForKey(eldest.key);
        } else {
            if (size >= threshold)
                resize(2 * table.length);
        }
    }
{code}

I suspect that the memory leak may have been caused by not deleting an {{HConnection}} that gets evicted, for which I suggest the following workaround:
{code}
    protected boolean removeEldestEntry(
        Map.Entry<HConnectionKey, HConnectionImplementation> eldest) {
      boolean remove = size() > MAX_CACHED_HBASE_INSTANCES;
      if (remove) {
        deleteConnection(eldest.getKey().conf, false);
      }
      return remove;
    }
{code}
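To make the eviction-with-cleanup idea concrete, here is a minimal, self-contained sketch. It is not the patch's code: {{ClosingLruCache}} and {{FakeConnection}} are hypothetical names, and closing the evicted value stands in for whatever {{deleteConnection}} does in the real class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache that releases a value at the moment it is evicted. */
final class ClosingLruCache<K, V extends AutoCloseable> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    ClosingLruCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most recently accessed.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        boolean remove = size() > maxEntries;
        if (remove) {
            try {
                // Release the resource; the map itself removes the entry afterwards.
                eldest.getValue().close();
            } catch (Exception e) {
                // Best effort in this sketch; real code would at least log the failure.
            }
        }
        return remove;
    }
}

/** Stand-in for a connection, used only to observe whether close() was called. */
final class FakeConnection implements AutoCloseable {
    boolean closed;

    @Override
    public void close() {
        closed = true;
    }
}
```

The sketch only closes the value and leaves the removal to the map, because the {{LinkedHashMap}} javadoc asks {{removeEldestEntry}} to return false if it modifies the map itself.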

To address the {{Configuration}} identity issue (crisis?), I introduced the notion of an {{HConnectionKey}}, which considers only the connection-specific properties when checking equality and computing the hash code, and rewrote {{HConnectionManager#HBASE_INSTANCES}} in terms of it. Further, there's a new property called {{HConstants.HBASE_CLIENT_INSTANCE_ID}} which, if non-null, can be used to uniquely identify its connection key.
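As a rough illustration of the key idea (not the class from the patch), the sketch below builds a key from a fixed subset of properties, so that two configurations differing only in unrelated settings compare equal. {{ConnectionKeySketch}} is a hypothetical name, a plain {{Map}} stands in for {{Configuration}}, and the property names listed are assumptions chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the HConnectionKey idea; not the patch's actual class. */
final class ConnectionKeySketch {
    // Assumed subset of connection-specific property names, for illustration only.
    static final String[] CONNECTION_PROPERTIES = {
        "hbase.zookeeper.quorum",
        "zookeeper.znode.parent",
        "hbase.client.instance.id"
    };

    // Snapshot of the relevant properties, taken when the key is created.
    private final Map<String, String> properties = new HashMap<>();

    ConnectionKeySketch(Map<String, String> conf) {
        for (String name : CONNECTION_PROPERTIES) {
            String value = conf.get(name);
            if (value != null) {
                properties.put(name, value);
            }
        }
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof ConnectionKeySketch)) {
            return false;
        }
        return properties.equals(((ConnectionKeySketch) other).properties);
    }

    @Override
    public int hashCode() {
        return properties.hashCode();
    }
}
```

Because the key copies the values it cares about, later changes to the configuration do not alter the hash code of a key that already sits in a map. Setting a distinct instance id in one configuration is enough to give it a key, and therefore a connection, of its own.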

Can you please take a look at the latest patch to see if we're on the right track? Note that I've deliberately kept the reference count changes out of this patch since they're not absolutely required here. I still feel that adding the reference count to the {{HConnection}} interface makes more sense, since the {{HConnectionManager}} has no idea when to change it; only its consumers (i.e. {{HTable}}) do. Given that, is it okay if we talk about reference counting in HBASE-3766?

>         Attachments: HBASE-3777.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022660#comment-13022660 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/
-----------------------------------------------------------

(Updated 2011-04-21 07:23:09.807767)


Review request for hbase and Ted Yu.


Changes
-------

This patch incorporates Ted's and Tod's comments on the earlier version.


Summary
-------

Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?

Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".


This addresses bug HBASE-3777.
    https://issues.apache.org/jira/browse/HBASE-3777


Diffs (updated)
-----

  src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
  src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
  src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
  src/main/java/org/apache/hadoop/hbase/client/HConnection.java 2bb4725 
  src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a 
  src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
  src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
  src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
  src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
  src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java 79a48ba 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11 
  src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
  src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e 
  src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3 
  src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 
  src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 

Diff: https://reviews.apache.org/r/643/diff


Testing
-------

mvn test


Thanks,

Karthick



>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "M. C. Srivas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026821#comment-13026821 ] 

M. C. Srivas commented on HBASE-3777:
-------------------------------------

It seems error-prone to compare conf's to identify clusters.

The mapping should really be "cluster uuid" (if such a thing exists) to connection. Perhaps there's a hmaster md5 that can be used in lieu of cluster-uuid sitting in ZK that can be probed?

An alternative is to go ahead and make the extra connection and use it to determine which cluster the client is going against. If it's a previously seen cluster, close the newly created connection and use the stashed one. Otherwise, this is a new cluster, so create a new mapping entry.
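The probe-then-deduplicate flow described above could look roughly like the following sketch. Everything here is hypothetical: {{ClusterConnection}}, {{ClusterConnectionRegistry}} and {{FakeClusterConnection}} are illustrative names, and the sketch simply assumes a connection can report some cluster identifier once it is open.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical connection type; only the two methods the sketch needs. */
interface ClusterConnection {
    String clusterId(); // assumed to be readable once connected

    void close();
}

/** Maps a cluster id to the single connection kept for that cluster. */
final class ClusterConnectionRegistry {
    private final Map<String, ClusterConnection> byClusterId = new HashMap<>();

    synchronized ClusterConnection connect(Supplier<ClusterConnection> opener) {
        // Open the extra connection first, then ask it which cluster it reached.
        ClusterConnection fresh = opener.get();
        ClusterConnection stashed = byClusterId.get(fresh.clusterId());
        if (stashed != null) {
            // Previously seen cluster: discard the new connection, reuse the old one.
            fresh.close();
            return stashed;
        }
        // New cluster: remember this connection for later callers.
        byClusterId.put(fresh.clusterId(), fresh);
        return fresh;
    }
}

/** In-memory stand-in used only to exercise the registry. */
final class FakeClusterConnection implements ClusterConnection {
    private final String id;
    boolean closed;

    FakeClusterConnection(String id) {
        this.id = id;
    }

    @Override
    public String clusterId() {
        return id;
    }

    @Override
    public void close() {
        closed = true;
    }
}
```

The trade-off is one throwaway connection per lookup of an already known cluster, in exchange for not having to infer cluster identity from configuration values.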




>         Attachments: 3777-TOF.patch, HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019550#comment-13019550 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

I think this JIRA and HBASE-3766 combined can be expressed by my comment on HBASE-3734 at 05/Apr/11 05:20

>         Attachments: HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022657#comment-13022657 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------



bq.  On 2011-04-21 00:12:09, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 129
bq.  > <https://reviews.apache.org/r/643/diff/1/?file=16725#file16725line129>
bq.  >
bq.  >     I agree that value for "hbase.zookeeper.property.maxClientCnxns" property should be used here.
bq.  >     
bq.  >     It's Okay to do that in another JIRA. It's your call.

Here's what I did:

    MAX_CACHED_HBASE_INSTANCES = HBaseConfiguration.create().getInt(
        HConstants.ZOOKEEPER_MAX_CLIENT_CNXNS,
        HConstants.DEFAULT_ZOOKEPER_MAX_CLIENT_CNXNS) + 1;

The assumption here is that the value for "hbase.zookeeper.property.maxClientCnxns" will be the same across all of the configuration instances, which typically is the case.


- Karthick


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review511
-----------------------------------------------------------


On 2011-04-20 23:56:13, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/643/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-20 23:56:13)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
bq.  
bq.  Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".
bq.  
bq.  
bq.  This addresses bug HBASE-3777.
bq.      https://issues.apache.org/jira/browse/HBASE-3777
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnection.java 2bb4725 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
bq.    src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
bq.    src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
bq.    src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 
bq.  
bq.  Diff: https://reviews.apache.org/r/643/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Karthick
bq.  
bq.



>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020016#comment-13020016 ] 

Karthick Sankarachary commented on HBASE-3777:
----------------------------------------------

Note that even with the {{HConnectionKey}} model, it is possible to change the identity of the {{Configuration}} by, say, modifying its {{HConstants.ZOOKEEPER_QUORUM}}. That said, I believe all of the connection-specific properties defined in {{HConnectionKey}} should ideally be treated as "final" by the developer.

>         Attachments: HBASE-3777.patch, HBASE-3777.patch

[jira] [Updated] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthick Sankarachary updated HBASE-3777:
-----------------------------------------

    Attachment: HBASE-3777.patch

>         Attachments: HBASE-3777.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027832#comment-13027832 ] 

stack commented on HBASE-3777:
------------------------------

I took a look at the posted patch.  I'm thinking we should commit it as is (Any objections?  I can address Ted Yu's last comment on commit).  Unfortunately, it won't do for 0.90.x since it's now polluted with TRUNKisms -- i.e. ServerName -- but that's probably ok since this is a big change.   Let me try this patch out on a cluster in the meantime to make sure it basically works.

>         Attachments: 3777-TOF.patch, HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021937#comment-13021937 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

The following method is called only by the finalizer:
{code}
    void close(boolean stopProxy) {
{code}
I think we should call it when the reference count reaches 0.

Also, in deleteConnection(), the code starting at line 221 should be enclosed in a synchronized block. All the other accesses to HBASE_INSTANCES are protected.
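A minimal sketch of the "close when the count reaches 0" suggestion, with every state change guarded by the same lock. {{RefCountedConnection}} is a hypothetical class, not the {{HConnectionImplementation}} from the patch, and the {{closed}} flag stands in for the real {{close(boolean stopProxy)}} work.

```java
/** Hypothetical reference-counted holder; all state changes share one monitor. */
final class RefCountedConnection {
    private int refCount;
    private boolean closed;

    /** Called by a consumer (e.g. a table handle) when it starts using the connection. */
    synchronized void retain() {
        if (closed) {
            throw new IllegalStateException("connection already closed");
        }
        refCount++;
    }

    /**
     * Called by a consumer when it is done.
     * Returns true if this call released the last reference and closed the connection.
     */
    synchronized boolean release() {
        if (refCount == 0) {
            throw new IllegalStateException("release() without matching retain()");
        }
        refCount--;
        if (refCount == 0) {
            // Real code would stop proxies and close the ZooKeeper session here
            // instead of waiting for a finalizer.
            closed = true;
        }
        return closed;
    }

    synchronized boolean isClosed() {
        return closed;
    }
}
```

With this shape, cleanup happens deterministically on the last {{release()}}, and a finalizer would at most serve as a safety net.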


> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022572#comment-13022572 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review512
-----------------------------------------------------------



src/main/java/org/apache/hadoop/hbase/HConstants.java
<https://reviews.apache.org/r/643/#comment1034>

    Specify that, even if the instance ids are the same, the Connections may still not be shared if some of the other parameters differ. Right?



src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
<https://reviews.apache.org/r/643/#comment1035>

    why not final?



src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
<https://reviews.apache.org/r/643/#comment1043>

    assert !stopped, or even Preconditions.checkState(!stopped)



src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java
<https://reviews.apache.org/r/643/#comment1036>

    I don't follow this. It seems like we're releasing a resource we didn't necessarily take ourselves. Spaghetti warning sign.



src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java
<https://reviews.apache.org/r/643/#comment1037>

    why not final anymore?



src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<https://reviews.apache.org/r/643/#comment1038>

    this construction can go outside the synchronized block



src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<https://reviews.apache.org/r/643/#comment1039>

    this should be renamed to decrementAndGetRefCount() to make clear that it returns the post-decrement value.
    
    Similarly for incCount above.



src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<https://reviews.apache.org/r/643/#comment1040>

    when is this method ever safe to use? I think it can be removed



src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<https://reviews.apache.org/r/643/#comment1041>

    this logic seems wrong: the finalizer will get called even if the connection is already closed, so the ref count will get decremented below 0.



src/main/java/org/apache/hadoop/hbase/client/HTable.java
<https://reviews.apache.org/r/643/#comment1042>

    all of these finally blocks should instead use something like IOUtils.cleanUp -- and HConnection should implement Closeable. This way, if the close itself throws, it doesn't mask a prior exception thrown inside the actual try {...}


- Todd
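
The last comment above (a close() that throws inside finally can hide the exception that came out of the try body) can be sketched as follows. This is only an illustration under stated assumptions: QuietCleanup and its methods are hypothetical names written for this example, not Hadoop's IOUtils, and the "flush failed" error is simulated.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical helper written for this example; it is not Hadoop's IOUtils. */
class QuietCleanup {

    /**
     * Closes every non-null resource and swallows any IOException from close(),
     * so a failing close cannot replace an exception that is already propagating.
     * Returns the number of resources whose close() failed.
     */
    static int cleanup(Closeable... resources) {
        int failures = 0;
        for (Closeable resource : resources) {
            if (resource == null) {
                continue;
            }
            try {
                resource.close();
            } catch (IOException e) {
                failures++; // a real helper would log this instead of counting it
            }
        }
        return failures;
    }

    /** Simulates a close path whose main work fails before the cleanup runs. */
    static void flushThenClose(Closeable connection) throws IOException {
        try {
            throw new IOException("flush failed"); // simulated failure in the try body
        } finally {
            cleanup(connection); // never throws, so "flush failed" reaches the caller
        }
    }
}
```

Had the finally block called connection.close() directly, a failure in close() would have been the exception the caller sees, and the original "flush failed" would be lost.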


On 2011-04-20 23:56:13, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/643/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-20 23:56:13)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
bq.  
bq.  Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".
bq.  
bq.  
bq.  This addresses bug HBASE-3777.
bq.      https://issues.apache.org/jira/browse/HBASE-3777
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnection.java 2bb4725 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
bq.    src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
bq.    src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
bq.    src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 
bq.  
bq.  Diff: https://reviews.apache.org/r/643/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Karthick
bq.  
bq.



> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022560#comment-13022560 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review511
-----------------------------------------------------------



src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<https://reviews.apache.org/r/643/#comment1033>

    I agree that the value of the "hbase.zookeeper.property.maxClientCnxns" property should be used here.
    
    It's okay to do that in another JIRA. It's your call.


- Ted





> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027813#comment-13027813 ] 

stack commented on HBASE-3777:
------------------------------

Just FYI, we have cluster uuid as of HBASE-3677.

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>         Attachments: 3777-TOF.patch, HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021952#comment-13021952 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

I think the following test failure is a regression (on Linux):
{code}
-------------------------------------------------------------------------------
Test set: org.apache.hadoop.hbase.TestHBaseTestingUtility
-------------------------------------------------------------------------------
Tests run: 6, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 191.418 sec <<< FAILURE!
multiClusters(org.apache.hadoop.hbase.TestHBaseTestingUtility)  Time elapsed: 180.095 sec  <<< ERROR!
java.lang.Exception: test timed out after 180000 milliseconds
        at java.lang.Object.wait(Native Method)
        at java.lang.Thread.join(Thread.java:1186)
        at java.lang.Thread.join(Thread.java:1239)
        at org.apache.hadoop.hbase.LocalHBaseCluster.join(LocalHBaseCluster.java:407)
        at org.apache.hadoop.hbase.MiniHBaseCluster.join(MiniHBaseCluster.java:501)
        at org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniHBaseCluster(HBaseTestingUtility.java:457)
        at org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniCluster(HBaseTestingUtility.java:431)
        at org.apache.hadoop.hbase.TestHBaseTestingUtility.multiClusters(TestHBaseTestingUtility.java:126)
{code}


> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022624#comment-13022624 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

I meant CONNECTION_PROPERTIES for HConnectionKey.

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020134#comment-13020134 ] 

Karthick Sankarachary commented on HBASE-3777:
----------------------------------------------

{quote}Suppose client A comes with config and places an entry in HBASE_INSTANCES. Then client B comes with config, the previous entry would be returned to client B.
Now client A modifies one of the connection-specific properties - resulting in a change of the HConnectionKey for client B.{quote}

Ah, I see. By freezing the config in the key, we ensure that subsequent changes to the config will only affect the client that is making that change. I will update the patch to do that momentarily. 
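
A minimal sketch of what "freezing the config in the key" could look like. Everything here is an assumption made for illustration: ConnectionKey is a hypothetical class, a plain Map stands in for the Hadoop Configuration, and the property list is only an example, not the CONNECTION_PROPERTIES list from the patch.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical key class; the property list below is illustrative only. */
final class ConnectionKey {
    static final String[] CONNECTION_PROPERTIES = {
        "hbase.zookeeper.quorum",
        "hbase.zookeeper.property.clientPort",
        "zookeeper.znode.parent"
    };

    // Snapshot of the connection-specific properties, taken at construction time.
    private final Map<String, String> frozen;

    /** Copies only the connection-specific properties out of the given config. */
    ConnectionKey(Map<String, String> conf) {
        Map<String, String> copy = new TreeMap<>();
        for (String name : CONNECTION_PROPERTIES) {
            String value = conf.get(name);
            if (value != null) {
                copy.put(name, value);
            }
        }
        this.frozen = Collections.unmodifiableMap(copy);
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ConnectionKey
            && frozen.equals(((ConnectionKey) other).frozen);
    }

    @Override
    public int hashCode() {
        return frozen.hashCode();
    }
}
```

Because the key holds its own copy, two configs that agree on the listed properties produce equal keys, and a later change by client A to its config leaves the key already stored in the map (and therefore client B's connection) untouched; only a key built afterwards from A's modified config differs.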

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>         Attachments: HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021944#comment-13021944 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

HTable.close() throws IOException, so this.connection.close() should be enclosed in a finally block:
{code}
      try {
        flushCommits();
        this.pool.shutdown();
      } finally {
        this.connection.close();
      }
{code}

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020848#comment-13020848 ] 

Karthick Sankarachary commented on HBASE-3777:
----------------------------------------------

Please review the updated version (V2) of the patch, which:

# Adds the following properties to the connection key:
#* The zookeeper client port, which is pulled in by {{ZKConfig#makeZKProps}}.
#* The recoverable zookeeper wait time, which is pulled in by the {{ZooKeeperWatcher}}.
# Closes the {{HConnection}} only if there are no strong references to it. This is necessitated by the fact that they can now potentially be shared by multiple clients and configurations. Note that it relies on the garbage collector to clean up the connection, which I feel is a safer approach. Alternatively, we can have the HCM implement a reference counting mechanism, but that would call for a strict clean up strategy.
# As far as testing is concerned, all but five tests passed. FWIW, those failures occur even without the patch, so in that sense, no regressions were found.

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777.patch

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020031#comment-13020031 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

I am not sure if the following is related.
TestMultipleTimestamps hangs:
{code}
"main" prio=5 tid=103000800 nid=0x100601000 waiting on condition [1005ff000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:196)
        at org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:420)
        at org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:280)
        at org.apache.hadoop.hbase.MiniHBaseCluster.<init>(MiniHBaseCluster.java:79)
        at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:387)
        at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:367)
        at org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:323)
        at org.apache.hadoop.hbase.client.TestMultipleTimestamps.setUpBeforeClass(TestMultipleTimestamps.java:54)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
{code}
I will remove changes from HBASE-1364 on my computer and try again.

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777.patch, HBASE-3777.patch, HBASE-3777.patch
>
>

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022568#comment-13022568 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

Since the close method below doesn't show up in the diff on Review Board, I want to comment here:
{code}
    void close(boolean stopProxy) {
{code}
What I meant was that the original call in deleteConnection(Configuration conf, boolean stopProxy):
{code}
      if (t != null) {
        t.close(stopProxy);
      }
{code}
can be used when the reference count reaches zero. So we would have:
{code}
  public static void deleteConnection(Configuration conf, boolean stopProxy) {
    synchronized (HBASE_INSTANCES) {
      HConnectionKey connectionKey = new HConnectionKey(conf);
      HConnectionImplementation connection = HBASE_INSTANCES
          .get(connectionKey);
      if (connection != null) {
        if (connection.decRef() == 0) {
          HBASE_INSTANCES.remove(connectionKey);
          connection.close(stopProxy);
        } else if (stopProxy) {
          connection.stopProxyOnClose(stopProxy);
        }
      }
    }
  }
{code}
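
For readers without the patch at hand, the reference counting idea above can be tried out in isolation. The following is a minimal, self-contained sketch and not the HCM code: {{RefCountedRegistrySketch}} and {{SharedConnection}} are hypothetical stand-ins, the key is a plain string instead of an {{HConnectionKey}}, and the {{stopProxy}} handling is left out.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry that closes a shared connection on its last release. */
final class RefCountedRegistrySketch {
  /** Stand-in for a connection; it only tracks a count and a closed flag. */
  static final class SharedConnection {
    private int refCount;
    private boolean closed;

    void incRef() { refCount++; }

    /** Decrements the count (never below zero) and returns the new value. */
    int decRef() {
      if (refCount > 0) {
        refCount--;
      }
      return refCount;
    }

    void close() { closed = true; }

    boolean isClosed() { return closed; }
  }

  private final Map<String, SharedConnection> instances = new HashMap<>();

  /** Returns the shared connection for the key, creating it on first use. */
  synchronized SharedConnection getConnection(String key) {
    SharedConnection connection = instances.get(key);
    if (connection == null) {
      connection = new SharedConnection();
      instances.put(key, connection);
    }
    connection.incRef();
    return connection;
  }

  /** Releases one reference; the last release removes and closes the connection. */
  synchronized void deleteConnection(String key) {
    SharedConnection connection = instances.get(key);
    if (connection != null && connection.decRef() == 0) {
      instances.remove(key);
      connection.close();
    }
  }
}
```

With this shape, every {{getConnection}} call has to be paired with exactly one {{deleteConnection}} call, which is the strict clean-up discipline that reference counting requires.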


> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777.patch
>
>

[jira] [Updated] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthick Sankarachary updated HBASE-3777:
-----------------------------------------

    Attachment: HBASE-3777.patch

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777.patch
>
>

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022553#comment-13022553 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/
-----------------------------------------------------------

Review request for hbase and Ted Yu.


Summary
-------

Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?

Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".


This addresses bug HBASE-3777.
    https://issues.apache.org/jira/browse/HBASE-3777


Diffs
-----

  src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
  src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
  src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
  src/main/java/org/apache/hadoop/hbase/client/HConnection.java 2bb4725 
  src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a 
  src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
  src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
  src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
  src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
  src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
  src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
  src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 

Diff: https://reviews.apache.org/r/643/diff


Testing
-------

mvn test


Thanks,

Karthick



> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777.patch
>
>

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022580#comment-13022580 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review513
-----------------------------------------------------------



src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<https://reviews.apache.org/r/643/#comment1044>

    I think close(boolean stopProxy) should check this.closed at the beginning.
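
A guard of the kind suggested here could look like the sketch below. It is an illustration only: {{IdempotentCloseSketch}} and its {{releaseCount}} field are hypothetical, and the {{stopProxy}} parameter is kept just to mirror the signature under review.

```java
/** Hypothetical connection whose close() is safe to call more than once. */
final class IdempotentCloseSketch {
  private boolean closed;
  // Counts how often resources were actually released (for illustration only).
  private int releaseCount;

  synchronized void close(boolean stopProxy) {
    if (closed) {
      return; // already closed: do nothing on repeated calls
    }
    closed = true;
    // Stand-in for the real clean-up work; stopProxy is unused in this sketch.
    releaseCount++;
  }

  synchronized boolean isClosed() { return closed; }

  synchronized int getReleaseCount() { return releaseCount; }

  public static void main(String[] args) {
    IdempotentCloseSketch connection = new IdempotentCloseSketch();
    connection.close(true);
    connection.close(true);
    System.out.println(connection.getReleaseCount()); // prints 1
  }
}
```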


- Ted


On 2011-04-20 23:56:13, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/643/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-20 23:56:13)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.



> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777.patch
>
>

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116518#comment-13116518 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

@Bright:
Do all tests in 0.90 pass?

I got the following when applying your patch:
{code}
Hunk #11 succeeded at 1416 (offset 8 lines).
1 out of 11 hunks FAILED -- saving rejects to file src/main/java/org/apache/hadoop/hbase/client/HTable.java.rej
{code}
This is minor.

Running test suite.
                
> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: 3777-TOF.patch, HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777-V8.0.90.4.backport.patch, HBASE-3777.patch
>
>

[jira] [Updated] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthick Sankarachary updated HBASE-3777:
-----------------------------------------

    Attachment: HBASE-3777-V2.patch

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777.patch
>
>

[jira] [Updated] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthick Sankarachary updated HBASE-3777:
-----------------------------------------

    Attachment: HBASE-3777-V6.patch

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777.patch
>
>

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "M. C. Srivas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027281#comment-13027281 ] 

M. C. Srivas commented on HBASE-3777:
-------------------------------------

bq. The thing is that a HConnection's behavior is determined not just by the server-side cluster it goes against, but also its client-side properties, such as "hbase.client.retries.number", "hbase.client.prefetch.limit", and so on. Ergo, we really need a different connection for every unique set of connection-specific config properties, whether it be client- or server-specific.

I am beginning to understand the reasons behind taking this approach. Thanks for explaining.

bq. As per the ZK/HBase use cases wiki, in theory we can have multiple masters registered with the ZK (to eliminate any SPOFs perhaps?). So, I'm not sure we can presuppose what hmaster we'll be going to at any given point in time.

Even in the presence of multiple hmasters, does it really matter whether we connect back to the same hmaster? It is probably important for the hmasters themselves which hmaster they connect to (and perhaps for region-servers as well), but it should not matter for clients. Agree? (Of course, I am stating all this without knowing any details about HBase, so don't kill me for it.)

bq. The whole purpose of this patch was to reduce the number of connections by reusing them to the extent possible. At one point, the config's equals method was treated as the key to the connection, which promoted reuse to some extent, but started breaking down if the config was changed after the fact. Currently, the config's identity (object reference) is treated as the key, but that suffers from connection overload. Hopefully, the HConnectionKey defined in the HCM will serve as a happy medium between the two ends of the spectrum.
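
To illustrate the "happy medium" described in the quote, here is a minimal sketch of a key that compares only connection-relevant properties and takes a snapshot of them when it is created. It is not the {{HConnectionKey}} from the patch: the class name {{ConnectionKeySketch}}, the use of a plain {{Map}} instead of a {{Configuration}}, and the exact property list are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical key whose identity comes from a fixed subset of properties. */
final class ConnectionKeySketch {
  // Assumed list of connection-relevant properties; the patch may use another set.
  static final List<String> CONNECTION_PROPERTIES = Arrays.asList(
      "hbase.zookeeper.quorum",
      "hbase.client.retries.number",
      "hbase.client.prefetch.limit");

  // Snapshot taken at construction, so later config changes do not alter the key.
  private final Map<String, String> properties = new TreeMap<>();

  ConnectionKeySketch(Map<String, String> conf) {
    for (String name : CONNECTION_PROPERTIES) {
      String value = conf.get(name);
      if (value != null) {
        properties.put(name, value);
      }
    }
  }

  @Override
  public boolean equals(Object other) {
    return other instanceof ConnectionKeySketch
        && properties.equals(((ConnectionKeySketch) other).properties);
  }

  @Override
  public int hashCode() {
    return properties.hashCode();
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put("hbase.zookeeper.quorum", "zk1,zk2");
    conf.put("hbase.client.retries.number", "10");
    // A copy with an extra, connection-irrelevant property maps to the same key.
    Map<String, String> copy = new HashMap<>(conf);
    copy.put("some.unrelated.property", "x");
    System.out.println(new ConnectionKeySketch(conf).equals(new ConnectionKeySketch(copy))); // prints true
  }
}
```

Because the key copies the values it cares about, a configuration and its clone map to the same connection, while changing the configuration after the fact does not silently change the identity of an existing key.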


Ted Yu pointed out the work being done here, so I started reading the JIRA. I am not familiar with where/how the HConnection instance gets used, and this JIRA was pretty long to understand with the code changes and all.

I started to comment on this JIRA because of the problems we faced trying to scale up the YCSB benchmark. We tried to run about 500 threads in the YCSB HBase client and ran out of connections to ZK. It came as a complete surprise that the HBase client needed to maintain multiple connections to ZK; it seemed to be using one per thread (i.e., per HTable).

We share the same goal: with this patch, we hope to be able to scale YCSB to 50 client machines, with 500 threads per client, and see how HBase holds up.

Would you agree that, in the long run, the HBase client should use ZK only to find the hmaster and region-servers, and not keep the connection to ZK open? Otherwise ZK may go under as we try to scale the number of HBase clients.


> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: 3777-TOF.patch, HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777.patch
>
>

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020035#comment-13020035 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

TestConnectionManager passed.
But TestMultipleTimestamps still hangs.


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020097#comment-13020097 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

I think the assertion should be changed.
Please run through TestHCM and see if other assertions need to be changed.


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026730#comment-13026730 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review607
-----------------------------------------------------------



src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<https://reviews.apache.org/r/643/#comment1250>

    A log statement should be added for the case where connectSucceeded is false.


- Ted


On 2011-04-28 22:01:18, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/643/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-28 22:01:18)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
bq.  
bq.  Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".
bq.  
bq.  
bq.  This addresses bug HBASE-3777.
bq.      https://issues.apache.org/jira/browse/HBASE-3777
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 0911375 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java feed777 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java fa7448b 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnection.java 1beedaf 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java ded51c8 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java 46bac9f 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
bq.    src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 26d0b31 
bq.    src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
bq.    src/main/java/org/apache/hadoop/hbase/mapred/TableOutputFormat.java 80284bb 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java 05d9395 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java 58c9153 
bq.    src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 64c14df 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d211b53 
bq.    src/main/java/org/apache/hadoop/hbase/replication/master/ReplicationLogCleaner.java 133da33 
bq.    src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java fc71f03 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 1c1d94b 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java 39f3af2 
bq.    src/main/java/org/apache/hadoop/hbase/util/HMerge.java c447287 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java a4def94 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
bq.    src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java 75613b8 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java 126d9af 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 18e647e 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestHCM.java 5d71d75 
bq.    src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 
bq.    src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java 752e12b 
bq.    src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 
bq.  
bq.  Diff: https://reviews.apache.org/r/643/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Karthick
bq.  
bq.




[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022742#comment-13022742 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

For version 5, I got:
{code}
-------------------------------------------------------------------------------
Test set: org.apache.hadoop.hbase.TestHBaseTestingUtility
-------------------------------------------------------------------------------
Tests run: 6, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 50.514 sec <<< FAILURE!
multiClusters(org.apache.hadoop.hbase.TestHBaseTestingUtility)  Time elapsed: 36.058 sec  <<< ERROR!
java.util.ConcurrentModificationException
        at java.util.LinkedHashMap$LinkedHashIterator.nextEntry(LinkedHashMap.java:373)
        at java.util.LinkedHashMap$KeyIterator.next(LinkedHashMap.java:384)
        at org.apache.hadoop.hbase.client.HConnectionManager.deleteAllConnections(HConnectionManager.java:219)
        at org.apache.hadoop.hbase.MiniHBaseCluster.shutdown(MiniHBaseCluster.java:512)
        at org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniHBaseCluster(HBaseTestingUtility.java:455)
        at org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniCluster(HBaseTestingUtility.java:431)
        at org.apache.hadoop.hbase.TestHBaseTestingUtility.multiClusters(TestHBaseTestingUtility.java:128)
{code}
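The trace above points at key iteration over a LinkedHashMap inside deleteAllConnections. A minimal sketch of how that exception can arise, assuming (this is not confirmed by the patch) that entries are removed from the same map while its key set is being iterated. ConnectionRegistry and its methods are hypothetical stand-ins, not HConnectionManager code, and the sketch is single-threaded, whereas the real failure may involve another thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a connection cache; not the real HConnectionManager.
class ConnectionRegistry {
    private final Map<String, String> connections = new LinkedHashMap<>();

    void put(String key, String conn) { connections.put(key, conn); }

    int size() { return connections.size(); }

    // Removes the entry for the given key, as a per-connection delete would.
    void delete(String key) { connections.remove(key); }

    // Unsafe: delete() structurally modifies the map while its key set is iterated,
    // so the fail-fast iterator throws ConcurrentModificationException on the next step.
    void deleteAllUnsafe() {
        for (String key : connections.keySet()) {
            delete(key);
        }
    }

    // Safer: iterate over a snapshot of the keys, so removals cannot disturb the iterator.
    void deleteAllSafe() {
        for (String key : new ArrayList<>(connections.keySet())) {
            delete(key);
        }
    }
}
```

Copying the keys first is only one option. Removing through the iterator itself, or guarding the map with a lock, might fit the real code better.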


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022812#comment-13022812 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

The exception from TestCatalogTracker was due to the following mock:
{code}
    final CatalogTracker ct = constructAndStartCatalogTracker(Mockito
        .mock(Configuration.class));
{code}
which leads to serverClassName being null:
{code}
      String serverClassName = conf.get(HConstants.REGION_SERVER_CLASS,
        HConstants.DEFAULT_REGION_SERVER_CLASS);
{code}
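To illustrate why the default argument does not help here: an unstubbed mock never runs the real lookup logic, so the two-argument get() returns null instead of the supplied default. The sketch below imitates that with a JDK dynamic proxy rather than Mockito. The Conf interface, MockedConfDemo, and getOrDefault are hypothetical names introduced for this example, not Hadoop or HBase APIs.

```java
import java.lang.reflect.Proxy;

// Hypothetical, trimmed-down view of a configuration lookup; not Hadoop's Configuration class.
interface Conf {
    String get(String name, String defaultValue);
}

class MockedConfDemo {
    // Builds a stub whose every method returns null, similar in spirit to an unstubbed mock.
    static Conf nullReturningStub() {
        return (Conf) Proxy.newProxyInstance(
            Conf.class.getClassLoader(),
            new Class<?>[] { Conf.class },
            (proxy, method, args) -> null);
    }

    // Defensive lookup: falls back to the default when the configuration yields null.
    static String getOrDefault(Conf conf, String name, String defaultValue) {
        String value = conf.get(name, defaultValue);
        return value != null ? value : defaultValue;
    }
}
```

A null check like getOrDefault is one way to tolerate such a stub. Passing a real configuration object to the test, or stubbing the lookup explicitly, may be the simpler fix.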


[jira] [Updated] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthick Sankarachary updated HBASE-3777:
-----------------------------------------

    Attachment: HBASE-3777-V4.patch


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027839#comment-13027839 ] 

Karthick Sankarachary commented on HBASE-3777:
----------------------------------------------

bq. I took a look at the posted patch. I'm thinking we should commit it as is (Any objections? I can address Ted Yu's last comment on commit)

Just an update: I ran the tests today after rebasing the patch (yet again), and this time there were no failures, period. I'll update the patch on the review board so you don't have to rebase it.


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022656#comment-13022656 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------



bq.  On 2011-04-21 00:47:22, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 1652
bq.  > <https://reviews.apache.org/r/643/diff/1/?file=16725#file16725line1652>
bq.  >
bq.  >     I think close(boolean stopProxy) should check this.closed at the beginning.

Yes, that's a good practice in general. In fact, the existing implementation wasn't completely idempotent, as explained above.
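
A minimal sketch of the guard being discussed, assuming a plain boolean closed field as the review comment suggests. SketchConnection and its counters are hypothetical and only stand in for the resources a real connection would release.

```java
// Hypothetical connection-like class; names are illustrative, not the patch's code.
class SketchConnection {
    private boolean closed = false;
    private int releases = 0;
    private int proxiesStopped = 0;

    // Guard first: a second call returns before touching any resource.
    synchronized void close(boolean stopProxy) {
        if (this.closed) {
            return;
        }
        this.closed = true;
        if (stopProxy) {
            proxiesStopped++; // stands in for stopping RPC proxies
        }
        releases++; // stands in for releasing other resources
    }

    synchronized boolean isClosed() { return closed; }

    synchronized int releases() { return releases; }

    synchronized int proxiesStopped() { return proxiesStopped; }
}
```

With the check at the top, calling close() twice releases resources once, which is the idempotent behaviour described above.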


- Karthick


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review513
-----------------------------------------------------------


On 2011-04-20 23:56:13, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/643/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-20 23:56:13)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
bq.  
bq.  Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".
bq.  
bq.  
bq.  This addresses bug HBASE-3777.
bq.      https://issues.apache.org/jira/browse/HBASE-3777
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnection.java 2bb4725 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
bq.    src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
bq.    src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
bq.    src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 
bq.  
bq.  Diff: https://reviews.apache.org/r/643/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Karthick
bq.  
bq.




[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026038#comment-13026038 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review591
-----------------------------------------------------------



src/main/java/org/apache/hadoop/hbase/client/HTable.java
<https://reviews.apache.org/r/643/#comment1230>

    We should place the deleteConnection() call in a finally block, because isTableEnabled() may throw an IOException:
    
      public boolean isTableEnabled(byte[] tableName) throws IOException;
    
    Looks like we should use the Connectable that Karthick proposed.
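
The try/finally shape suggested here can be sketched as follows. The static isTableEnabled and deleteConnection methods below are hypothetical stand-ins that only simulate a failing call and a cleanup step; they are not the HTable or HConnectionManager methods.

```java
import java.io.IOException;

class FinallyCleanupDemo {
    static int deleteCalls = 0;

    // Hypothetical stand-in for a call that may fail, like isTableEnabled(byte[]).
    static boolean isTableEnabled(boolean fail) throws IOException {
        if (fail) {
            throw new IOException("simulated failure");
        }
        return true;
    }

    // Hypothetical stand-in for releasing the connection.
    static void deleteConnection() {
        deleteCalls++;
    }

    static boolean checkEnabled(boolean fail) throws IOException {
        try {
            return isTableEnabled(fail);
        } finally {
            deleteConnection(); // runs on both the normal and the exceptional path
        }
    }
}
```

Without the finally block, an IOException from the check would skip the cleanup and leave the connection registered.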


- Ted


On 2011-04-27 18:33:01, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/643/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-27 18:33:01)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
bq.  
bq.  Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".
bq.  
bq.  
bq.  This addresses bug HBASE-3777.
bq.      https://issues.apache.org/jira/browse/HBASE-3777
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
bq.    src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
bq.    src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java 79a48ba 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11 
bq.    src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 
bq.    src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 
bq.    src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 
bq.  
bq.  Diff: https://reviews.apache.org/r/643/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Karthick
bq.  
bq.



> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: 3777-TOF.patch, HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777.patch
>
>
> Judging from the javadoc in {{HConnectionManager}}, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create {{HTable}} instances using a given {{Configuration}} instance and a copy thereof, we end up with two distinct {{HConnection}} instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
> Here, I'd like to play devil's advocate and propose that we "deep-compare" {{HBaseConfiguration}} instances, so that multiple {{HBaseConfiguration}} instances that have the same properties map to the same {{HConnection}} instance. In case one is "concerned that a single {{HConnection}} is insufficient for sharing amongst clients",  to quote the javadoc, then one should be able to mark a given {{HBaseConfiguration}} instance as being "uniquely identifiable".
> Note that "sharing connections makes clean up of {{HConnection}} instances a little awkward", unless of course, you apply the change described in HBASE-3766.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020093#comment-13020093 ] 

Karthick Sankarachary commented on HBASE-3777:
----------------------------------------------

I believe that the above {{TestHCM#testManyNewConnectionsDoesnotOOME}} failure is to be expected, given that the patch maps {{Configuration}} to {{HConnection}} based on the connection-specific properties. In other words, no matter how many times you try to get a connection using {{Configuration}} instances that differ only in their non-connection-specific properties, you should see but one {{HConnection}} instance in the HCM's cache. Shall I change that assertion to check for <1> instead of <31>?
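
To illustrate the mapping described above, here is a minimal sketch of a key whose identity depends only on connection-specific properties. The class name {{ConnectionKeySketch}}, the use of a plain {{Map}} in place of Hadoop's {{Configuration}}, and the particular property list are assumptions made for illustration; the patch may choose a different set.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical key: only the listed properties take part in equals/hashCode.
final class ConnectionKeySketch {
    // Assumed subset of connection-specific properties (illustrative only).
    static final List<String> CONNECTION_PROPERTIES = Arrays.asList(
        "hbase.zookeeper.quorum",
        "zookeeper.znode.parent",
        "hbase.zookeeper.property.clientPort");

    private final Map<String, String> values = new HashMap<>();

    // Copies just the connection-specific entries; missing ones are stored as null.
    ConnectionKeySketch(Map<String, String> conf) {
        for (String name : CONNECTION_PROPERTIES) {
            values.put(name, conf.get(name));
        }
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ConnectionKeySketch
            && values.equals(((ConnectionKeySketch) o).values);
    }

    @Override
    public int hashCode() {
        return values.hashCode();
    }
}
```

With such a key, configurations that differ only in unrelated properties would land on the same cache entry, which is why the test would see one cached connection rather than one per {{Configuration}} instance.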

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777.patch
>
>
> Judging from the javadoc in {{HConnectionManager}}, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create {{HTable}} instances using a given {{Configuration}} instance and a copy thereof, we end up with two distinct {{HConnection}} instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
> Here, I'd like to play devil's advocate and propose that we "deep-compare" {{HBaseConfiguration}} instances, so that multiple {{HBaseConfiguration}} instances that have the same properties map to the same {{HConnection}} instance. In case one is "concerned that a single {{HConnection}} is insufficient for sharing amongst clients",  to quote the javadoc, then one should be able to mark a given {{HBaseConfiguration}} instance as being "uniquely identifiable".
> Note that "sharing connections makes clean up of {{HConnection}} instances a little awkward", unless of course, you apply the change described in HBASE-3766.


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13024969#comment-13024969 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------



bq.  On 2011-04-23 02:14:11, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 228
bq.  > <https://reviews.apache.org/r/643/diff/3/?file=16911#file16911line228>
bq.  >
bq.  >     Shall we break out of the loop here ?

Will do.


- Karthick


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review531
-----------------------------------------------------------


On 2011-04-22 21:16:59, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/643/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-22 21:16:59)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
bq.  
bq.  Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".
bq.  
bq.  
bq.  This addresses bug HBASE-3777.
bq.      https://issues.apache.org/jira/browse/HBASE-3777
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
bq.    src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
bq.    src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java 79a48ba 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11 
bq.    src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 
bq.    src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 
bq.    src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 
bq.  
bq.  Diff: https://reviews.apache.org/r/643/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Karthick
bq.  
bq.




[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026059#comment-13026059 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review593
-----------------------------------------------------------


Throughout the patch, there are many cases where returning a connection to the pool needs to happen in a finally {} clause, to avoid leaks when exceptions are thrown.


src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<https://reviews.apache.org/r/643/#comment1232>

    hashcode -> TableServers is not quite right. It's not a map of hashcode - that implies that two confs that happened to hash to the same code would share an HConnectionImpl, which isn't true.



src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<https://reviews.apache.org/r/643/#comment1233>

    This increment has to be inside the synchronized (HBASE_INSTANCES) or else there's a race against deleteConnection.



src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java
<https://reviews.apache.org/r/643/#comment1234>

    This AtomicInteger is never used atomically (it is always accessed under a lock), so it could just be an int.
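
The two points above (incrementing under the same lock that deletion uses, and needing only a plain int when the lock is always held) can be sketched as follows. This is an illustration under assumptions, not the patch's code: {{RefCountedCacheSketch}}, {{acquire}} and {{release}} are hypothetical names, and a {{String}} stands in for the real key type.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache: every read and write of refCount happens under one lock.
final class RefCountedCacheSketch {
    static final class Entry {
        int refCount; // plain int is enough because the lock is always held
    }

    private final Map<String, Entry> instances = new HashMap<>();

    // Looks up (or creates) the entry and increments its count in one critical section,
    // so a concurrent release cannot remove the entry between lookup and increment.
    int acquire(String key) {
        synchronized (instances) {
            Entry e = instances.get(key);
            if (e == null) {
                e = new Entry();
                instances.put(key, e);
            }
            return ++e.refCount;
        }
    }

    // Decrements the count and removes the entry once nobody holds it.
    int release(String key) {
        synchronized (instances) {
            Entry e = instances.get(key);
            if (e == null) {
                return 0;
            }
            if (--e.refCount <= 0) {
                instances.remove(key);
                return 0;
            }
            return e.refCount;
        }
    }

    int size() {
        synchronized (instances) {
            return instances.size();
        }
    }
}
```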



src/main/java/org/apache/hadoop/hbase/client/HTable.java
<https://reviews.apache.org/r/643/#comment1235>

    these types of functions should be in try...finally. Otherwise getRegionCachePrefetch("table that does not exist") would leak a connection.
    
    Ideally we would have an implementation of HConnection called HConnectionRef which implements Closeable, so it would follow the standard "get an object, use it, close it" pattern. Calling deleteConnection just feels wrong.
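
A rough sketch of the "get an object, use it, close it" pattern suggested above, with the release placed in a finally block so a failed lookup does not leak a reference. All names here ({{ConnectionRefSketch}}, {{PrefetchSketch}}, {{lookup}}, {{locate}}) are hypothetical, and the static counter is only a single-threaded stand-in for a shared connection's reference count.

```java
import java.io.Closeable;

// Hypothetical handle: constructing it takes a reference, closing it gives one back.
final class ConnectionRefSketch implements Closeable {
    static int open = 0; // stand-in reference count, not thread-safe

    ConnectionRefSketch() {
        open++;
    }

    // Simulates a lookup that fails for a table that does not exist.
    String lookup(String table) {
        if (table.isEmpty()) {
            throw new IllegalArgumentException("no such table");
        }
        return "region-of-" + table;
    }

    @Override
    public void close() {
        open--;
    }
}

final class PrefetchSketch {
    // The reference is released whether lookup returns or throws.
    static String locate(String table) {
        ConnectionRefSketch ref = new ConnectionRefSketch();
        try {
            return ref.lookup(table);
        } finally {
            ref.close();
        }
    }
}
```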



src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java
<https://reviews.apache.org/r/643/#comment1236>

    Again, this needs try...finally.


- Todd


On 2011-04-28 00:13:52, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/643/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-28 00:13:52)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
bq.  
bq.  Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".
bq.  
bq.  
bq.  This addresses bug HBASE-3777.
bq.      https://issues.apache.org/jira/browse/HBASE-3777
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 70affa0 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
bq.    src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
bq.    src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
bq.    src/main/java/org/apache/hadoop/hbase/mapred/TableOutputFormat.java 80284bb 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java 05d9395 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java 250a8cf 
bq.    src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 04befe9 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11 
bq.    src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 
bq.    src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 
bq.    src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java 915cdf6 
bq.    src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 
bq.  
bq.  Diff: https://reviews.apache.org/r/643/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Karthick
bq.  
bq.




[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022573#comment-13022573 ] 

Karthick Sankarachary commented on HBASE-3777:
----------------------------------------------

I concur on the above comment. I'll update the patch after you're done with the review of this version.

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777.patch
>
>
> Judging from the javadoc in {{HConnectionManager}}, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create {{HTable}} instances using a given {{Configuration}} instance and a copy thereof, we end up with two distinct {{HConnection}} instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
> Here, I'd like to play devil's advocate and propose that we "deep-compare" {{HBaseConfiguration}} instances, so that multiple {{HBaseConfiguration}} instances that have the same properties map to the same {{HConnection}} instance. In case one is "concerned that a single {{HConnection}} is insufficient for sharing amongst clients",  to quote the javadoc, then one should be able to mark a given {{HBaseConfiguration}} instance as being "uniquely identifiable".
> Note that "sharing connections makes clean up of {{HConnection}} instances a little awkward", unless of course, you apply the change described in HBASE-3766.


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021932#comment-13021932 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

We can break out of the loop in HConnectionManager.putConnection() if the reference count reaches 0, right ?
{code}
          if (entry.getValue().decRef() > 0) {
            connectionKey = null;
          } else {
            break;
          }
{code}

> Redefine Identity Of HBase Configuration
> ----------------------------------------
>
>                 Key: HBASE-3777
>                 URL: https://issues.apache.org/jira/browse/HBASE-3777
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, ipc
>    Affects Versions: 0.90.2
>            Reporter: Karthick Sankarachary
>            Assignee: Karthick Sankarachary
>            Priority: Minor
>             Fix For: 0.92.0
>
>         Attachments: HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777.patch
>
>
> Judging from the javadoc in {{HConnectionManager}}, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create {{HTable}} instances using a given {{Configuration}} instance and a copy thereof, we end up with two distinct {{HConnection}} instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
> Here, I'd like to play devil's advocate and propose that we "deep-compare" {{HBaseConfiguration}} instances, so that multiple {{HBaseConfiguration}} instances that have the same properties map to the same {{HConnection}} instance. In case one is "concerned that a single {{HConnection}} is insufficient for sharing amongst clients",  to quote the javadoc, then one should be able to mark a given {{HBaseConfiguration}} instance as being "uniquely identifiable".
> Note that "sharing connections makes clean up of {{HConnection}} instances a little awkward", unless of course, you apply the change described in HBASE-3766.


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027914#comment-13027914 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review632
-----------------------------------------------------------

Ship it!


I think a patch for 0.90 should be produced separately.
We have informed HBase users of this change. They would expect to benefit from it in 0.90.

- Ted


On 2011-05-02 21:34:35, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/643/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-05-02 21:34:35)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
bq.  
bq.  Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".
bq.  
bq.  
bq.  This addresses bug HBASE-3777.
bq.      https://issues.apache.org/jira/browse/HBASE-3777
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 0911375 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java feed777 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java fa7448b 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnection.java 1beedaf 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java ded51c8 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java 46bac9f 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
bq.    src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 26d0b31 
bq.    src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
bq.    src/main/java/org/apache/hadoop/hbase/mapred/TableOutputFormat.java 80284bb 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java 05d9395 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java f526411 
bq.    src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 834c456 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 421f275 
bq.    src/main/java/org/apache/hadoop/hbase/replication/master/ReplicationLogCleaner.java 133da33 
bq.    src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java fc71f03 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 1c1d94b 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java 39f3af2 
bq.    src/main/java/org/apache/hadoop/hbase/util/HMerge.java c447287 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java a4def94 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
bq.    src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java 75613b8 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java ae333bb 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 18e647e 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestHCM.java 5d71d75 
bq.    src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 
bq.    src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java 752e12b 
bq.    src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 
bq.  
bq.  Diff: https://reviews.apache.org/r/643/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Karthick
bq.  
bq.




[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026061#comment-13026061 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------



bq.  On 2011-04-28 00:21:00, Todd Lipcon wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 130
bq.  > <https://reviews.apache.org/r/643/diff/3/?file=16911#file16911line130>
bq.  >
bq.  >     hashcode -> TableServers is not quite right. It's not a map of hashcode - that implies that two confs that happened to hash to the same code would share an HConnectionImpl, which isn't true.

Changed the comment to read "A LRU Map of HConnectionKey -> HConnection (TableServer)".
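
To illustrate the distinction Todd draws, here is a minimal sketch of an LRU map keyed by a value object rather than by a hash code. The class name LruCache and its size limit are hypothetical and are not taken from the patch; the sketch only relies on java.util.LinkedHashMap in access order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU map sketch; not the class used in HConnectionManager. */
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruCache(int maxEntries) {
        // accessOrder=true makes iteration order least-recently-used first
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; evict once the limit is exceeded
        return size() > maxEntries;
    }
}
```

Lookups in such a map are resolved with equals(), so two keys that merely share a hash code still map to separate entries, which is why describing it as a map of key -> connection is more accurate than a map of hashcode -> connection.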


bq.  On 2011-04-28 00:21:00, Todd Lipcon wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 179
bq.  > <https://reviews.apache.org/r/643/diff/3/?file=16911#file16911line179>
bq.  >
bq.  >     This increment has to be inside the synchronized (HBASE_INSTANCES) or else there's a race against deleteConnection.

I believe the increment is inside the synchronized block.


bq.  On 2011-04-28 00:21:00, Todd Lipcon wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 411
bq.  > <https://reviews.apache.org/r/643/diff/3/?file=16911#file16911line411>
bq.  >
bq.  >     this atomicinteger isn't ever used atomically (it's always under a lock) so it could just be an int.

Will make it an int.
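
The two points above (incrementing inside the lock, and a plain int being sufficient) can be sketched as follows. SharedRegistry, Entry, acquire and release are hypothetical names chosen for illustration; they are not the methods in HConnectionManager.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical reference-counted registry; names are illustrative only. */
final class SharedRegistry {
    static final class Entry {
        final String name;
        int refCount; // plain int: only read or written while holding the lock
        Entry(String name) { this.name = name; }
    }

    private final Map<String, Entry> instances = new HashMap<String, Entry>();

    Entry acquire(String key) {
        synchronized (instances) {
            Entry e = instances.get(key);
            if (e == null) {
                e = new Entry(key);
                instances.put(key, e);
            }
            // Incremented under the same lock as release(), so a concurrent
            // release cannot remove the entry between lookup and increment.
            e.refCount++;
            return e;
        }
    }

    /** Returns true if this call dropped the last reference and removed the entry. */
    boolean release(String key) {
        synchronized (instances) {
            Entry e = instances.get(key);
            if (e == null) {
                return false;
            }
            if (--e.refCount <= 0) {
                instances.remove(key);
                return true;
            }
            return false;
        }
    }

    int size() {
        synchronized (instances) {
            return instances.size();
        }
    }
}
```

Because every access to refCount happens while the registry lock is held, an AtomicInteger would add nothing here.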


bq.  On 2011-04-28 00:21:00, Todd Lipcon wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 1361
bq.  > <https://reviews.apache.org/r/643/diff/3/?file=16912#file16912line1361>
bq.  >
bq.  >     these types of functions should be in try...finally. Otherwise getRegionCachePrefetch("table that does not exist") would leak a connection.
bq.  >     
bq.  >     Ideally we would have an implementation of HConnection called HConnectionRef which implements Closeable, so it would be the standard "get an object, use it, close it" type pattern. Calling deleteConnection just feels wrong.

I'll make sure the connection is cleaned up inside a finally block across the board. 

As far as the closeable is concerned, that was indeed implemented in an earlier version, but I went back to using deleteConnection based on your comment about having IOUtils.cleanUp method, which I think I must've misunderstood. In any case, I can add the closeable back. Finally (no pun intended), I'll make sure that if the close were to fail, that it wouldn't mask any exception in the try block, if any.
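
A rough sketch of the "get an object, use it, close it" pattern, where a failing close does not mask an exception thrown in the try block. CloseQuietly, IoCall and useAndClose are hypothetical helpers written for this illustration; only java.io.Closeable and java.io.IOException are real APIs here.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical helper; shows one way to keep the body's exception as the reported one. */
final class CloseQuietly {
    interface IoCall<T> {
        T call() throws IOException;
    }

    static <T> T useAndClose(Closeable resource, IoCall<T> body) throws IOException {
        boolean succeeded = false;
        try {
            T result = body.call();
            succeeded = true;
            return result;
        } finally {
            if (succeeded) {
                // The body worked, so a close failure is the only error: let it propagate.
                resource.close();
            } else {
                try {
                    resource.close();
                } catch (IOException ignored) {
                    // The body already failed; its exception is the one the caller sees.
                }
            }
        }
    }
}
```

The resource is closed on every path, so a call that fails (for example on a table that does not exist) would not leak the connection.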


- Karthick


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review593
-----------------------------------------------------------


On 2011-04-28 00:13:52, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/643/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-28 00:13:52)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
bq.  
bq.  Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".
bq.  
bq.  
bq.  This addresses bug HBASE-3777.
bq.      https://issues.apache.org/jira/browse/HBASE-3777
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 70affa0 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
bq.    src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
bq.    src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
bq.    src/main/java/org/apache/hadoop/hbase/mapred/TableOutputFormat.java 80284bb 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java 05d9395 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java 250a8cf 
bq.    src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 04befe9 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11 
bq.    src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 
bq.    src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 
bq.    src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java 915cdf6 
bq.    src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 
bq.  
bq.  Diff: https://reviews.apache.org/r/643/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Karthick
bq.  
bq.




[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026008#comment-13026008 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------



bq.  On 2011-04-27 01:33:01, Jean-Daniel Cryans wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java, line 513
bq.  > <https://reviews.apache.org/r/643/diff/3/?file=16918#file16918line513>
bq.  >
bq.  >     IIUC, we are creating an additional connection here since CT will do a getConnection with the passed conf instead of using a connection that the RS already has.

Please correct me if I'm wrong, but the RS creates the connection (at least the HConnection kind) just for the sake of CT. As a matter of fact, I was able to safely remove the RS#connection field altogether. What I should also have done, but forgot to do, was remove the call to delete the connection at the end of RS' run method. In the upcoming patch, the RS will not try to delete the connection, since it doesn't acquire it, at least not directly, in the first place. Now, the CT takes over the ownership of the connection resource.


- Karthick


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review569
-----------------------------------------------------------


On 2011-04-27 18:33:01, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/643/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-27 18:33:01)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
bq.  
bq.  Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".
bq.  
bq.  
bq.  This addresses bug HBASE-3777.
bq.      https://issues.apache.org/jira/browse/HBASE-3777
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
bq.    src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
bq.    src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java 79a48ba 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11 
bq.    src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 
bq.    src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 
bq.    src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 
bq.  
bq.  Diff: https://reviews.apache.org/r/643/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Karthick
bq.  
bq.




[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022622#comment-13022622 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

TestHBaseTestingUtility.multiClusters() uses the following as uniquifier:
{code}
    htu1.getConfiguration().set(HConstants.ZOOKEEPER_ZNODE_PARENT, "/1");
{code}
Once I added HConstants.ZOOKEEPER_ZNODE_PARENT to CONNECTION_PROPERTIES for HConnectionKey, TestHBaseTestingUtility passed.
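
As a sketch of why the znode parent has to be part of the key: a key that compares only a fixed list of properties treats two configurations as the same connection unless one of those listed properties differs. ConnKey and KEY_PROPERTIES below are hypothetical names, a plain Map stands in for Configuration, and the two property names are only an illustrative subset of what the patch may use.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical key sketch; a Map<String, String> stands in for Configuration. */
final class ConnKey {
    // Illustrative subset; the actual list is defined in the patch under review.
    static final String[] KEY_PROPERTIES = {
        "hbase.zookeeper.quorum",
        "zookeeper.znode.parent",
    };

    private final Map<String, String> values = new HashMap<String, String>();

    ConnKey(Map<String, String> conf) {
        // Copy only the properties that define connection identity
        for (String name : KEY_PROPERTIES) {
            String value = conf.get(name);
            if (value != null) {
                values.put(name, value);
            }
        }
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ConnKey && values.equals(((ConnKey) other).values);
    }

    @Override
    public int hashCode() {
        return values.hashCode();
    }
}
```

With the znode parent in the list, two test clusters that share a quorum but use "/1" and "/2" as parents get distinct keys; without it they would compare equal and share a connection.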


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023189#comment-13023189 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

I replaced the mocking of configuration with UTIL.getConfiguration().
Other tests in TestCatalogTracker passed except for testNoTimeoutWaitForMeta which hung:
{code}
2011-04-22 03:25:25,241 ERROR [main-EventThread] zookeeper.ClientCnxn$EventThread(532): Error while calling watcher 
java.lang.IllegalArgumentException: Can't build a writable with empty bytes array
	at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:123)
	at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:102)
	at org.apache.hadoop.hbase.executor.RegionTransitionData.fromBytes(RegionTransitionData.java:238)
	at org.apache.hadoop.hbase.zookeeper.ZKUtil.logRetrievedMsg(ZKUtil.java:1124)
	at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:550)
	at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.nodeCreated(ZooKeeperNodeTracker.java:149)
	at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:279)
	at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
	at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
{code}
I think the above is caused by ZKUtil.createAndFailSilent() passing empty data to zk.create() - not related to the changes of this JIRA.


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026041#comment-13026041 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------



bq.  On 2011-04-27 23:03:49, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 261
bq.  > <https://reviews.apache.org/r/643/diff/3/?file=16912#file16912line261>
bq.  >
bq.  >     We should place deleteConnection() call in finally block because isTableEnabled() may throw IOException:
bq.  >     
bq.  >       public boolean isTableEnabled(byte[] tableName) throws IOException;
bq.  >     
bq.  >     Looks like we should use Connectable that Karthick proposed.

Yes, the delete now happens inside of a finally block in the current version, which was just rebased with the trunk, and is currently undergoing testing.


- Karthick


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review591
-----------------------------------------------------------






[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026741#comment-13026741 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------



bq.  On 2011-04-28 22:14:13, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java, line 338
bq.  > <https://reviews.apache.org/r/643/diff/5/?file=17633#file17633line338>
bq.  >
bq.  >     A log statement should be added for the case where connectSucceeded is false.

Will do. Note that the only reason the close method throws an IOException is because Closeable says so. In practice, the HConnectionImplementation#close() would not throw one, AFAIK.


- Karthick


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review607
-----------------------------------------------------------


On 2011-04-28 22:01:18, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/643/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-28 22:01:18)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
bq.  
bq.  Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".
bq.  
bq.  
bq.  This addresses bug HBASE-3777.
bq.      https://issues.apache.org/jira/browse/HBASE-3777
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 0911375 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java feed777 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java fa7448b 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnection.java 1beedaf 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java ded51c8 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java 46bac9f 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
bq.    src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 26d0b31 
bq.    src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
bq.    src/main/java/org/apache/hadoop/hbase/mapred/TableOutputFormat.java 80284bb 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java 05d9395 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java 58c9153 
bq.    src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 64c14df 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d211b53 
bq.    src/main/java/org/apache/hadoop/hbase/replication/master/ReplicationLogCleaner.java 133da33 
bq.    src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java fc71f03 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 1c1d94b 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java 39f3af2 
bq.    src/main/java/org/apache/hadoop/hbase/util/HMerge.java c447287 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java a4def94 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
bq.    src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java 75613b8 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java 126d9af 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 18e647e 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestHCM.java 5d71d75 
bq.    src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 
bq.    src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java 752e12b 
bq.    src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 
bq.  
bq.  Diff: https://reviews.apache.org/r/643/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Karthick
bq.  
bq.




[jira] [Updated] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthick Sankarachary updated HBASE-3777:
-----------------------------------------

    Attachment:     (was: HBASE-3777.patch)


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021901#comment-13021901 ] 

Karthick Sankarachary commented on HBASE-3777:
----------------------------------------------

I took a stab at implementing the reference count in the {{HCM}}, but left the {{finalize}} in there as a last line of defense. In order for this to work, every connection acquired through {{HCM#getConnection}} must be closed when no longer needed. For more details, please take a look at the V3 version of the patch. This time around, I ran the tests on a Linux box, and saw a couple of test cases fail in {{TestReplication}} and {{TestHBaseTestingUtility}}, but they were caused by some flaky file access issues.
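
To make the reference-counting contract concrete, here is a minimal sketch of the idea, not the code in the V3 patch. The names {{RefCountedConnections}}, {{Conn}}, {{acquire}} and {{release}} are hypothetical stand-ins; a plain string key takes the place of the configuration, and "closing" is just a flag.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry illustrating reference-counted, shared connections. */
class RefCountedConnections {
  /** Stand-in for a connection; it only records its count and closed state. */
  static final class Conn {
    private int refCount = 0;
    private boolean closed = false;

    boolean isClosed() { return closed; }
    int refCount() { return refCount; }
  }

  private final Map<String, Conn> connections = new HashMap<String, Conn>();

  /** Returns the shared connection for the key, creating it on first use, and bumps its count. */
  synchronized Conn acquire(String key) {
    Conn c = connections.get(key);
    if (c == null) {
      c = new Conn();
      connections.put(key, c);
    }
    c.refCount++;
    return c;
  }

  /** Drops one reference; the connection is closed and forgotten once the count reaches zero. */
  synchronized void release(String key) {
    Conn c = connections.get(key);
    if (c == null) {
      return; // nothing was acquired under this key
    }
    c.refCount--;
    if (c.refCount <= 0) {
      c.closed = true;
      connections.remove(key);
    }
  }
}
```

Under this scheme a caller that forgets to call {{release}} keeps the connection open indefinitely, which is presumably why a {{finalize}} fallback is still attractive as a last line of defense.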


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022795#comment-13022795 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

This failure is more interesting:
{code}
Tests run: 8, Failures: 0, Errors: 8, Skipped: 0, Time elapsed: 0.394 sec <<< FAILURE!
testThatIfMETAMovesWeAreNotified(org.apache.hadoop.hbase.catalog.TestCatalogTracker)  Time elapsed: 0.182 sec  <<< ERROR!
java.lang.NullPointerException
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:169)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:407)
        at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:186)
        at org.apache.hadoop.hbase.catalog.CatalogTracker.<init>(CatalogTracker.java:127)
        at org.apache.hadoop.hbase.catalog.CatalogTracker.<init>(CatalogTracker.java:110)
        at org.apache.hadoop.hbase.catalog.TestCatalogTracker.constructAndStartCatalogTracker(TestCatalogTracker.java:102)
        at org.apache.hadoop.hbase.catalog.TestCatalogTracker.testThatIfMETAMovesWeAreNotified(TestCatalogTracker.java:115)
{code}


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025941#comment-13025941 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/
-----------------------------------------------------------

(Updated 2011-04-27 18:33:01.305553)


Review request for hbase and Ted Yu.


Summary (updated)
-------

Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?

Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".
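
As an illustration of what "deep-compare" could mean here, the sketch below keys a connection cache on the property values rather than on object identity. It is an assumption-laden example, not the patch itself: {{ConfigKey}} is a hypothetical class, a plain {{Map}} stands in for the Hadoop {{Configuration}}, and the {{unique}} flag models the "uniquely identifiable" opt-out.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical cache key that compares configurations by value instead of by identity. */
final class ConfigKey {
  private final Map<String, String> snapshot; // defensive copy of the properties
  private final boolean unique;               // true = never share a connection

  ConfigKey(Map<String, String> properties, boolean unique) {
    this.snapshot = new HashMap<String, String>(properties);
    this.unique = unique;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (unique || !(o instanceof ConfigKey)) {
      return false; // a unique key only ever equals itself
    }
    ConfigKey other = (ConfigKey) o;
    return !other.unique && snapshot.equals(other.snapshot);
  }

  @Override
  public int hashCode() {
    // Unique keys fall back to identity hashing; shared keys hash by content.
    return unique ? System.identityHashCode(this) : snapshot.hashCode();
  }
}
```

With such a key, a map from {{ConfigKey}} to connection would hand the same connection to a configuration and to any copy of it, while a configuration marked unique would always get its own.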


This addresses bug HBASE-3777.
    https://issues.apache.org/jira/browse/HBASE-3777


Diffs
-----

  src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
  src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
  src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
  src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a 
  src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
  src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
  src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
  src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
  src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java 79a48ba 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11 
  src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
  src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e 
  src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3 
  src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 
  src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 

Diff: https://reviews.apache.org/r/643/diff


Testing
-------

mvn test


Thanks,

Karthick




[jira] [Updated] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthick Sankarachary updated HBASE-3777:
-----------------------------------------

    Attachment: HBASE-3777.patch


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020870#comment-13020870 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

TestHFileOutputFormat passed on a Linux machine.
testVerifyRepJob still failed.


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020911#comment-13020911 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

Speaking of matching {{HConnectionManager#getConnection}} with {{HConnectionManager#deleteConnection}}, we now know why TableOutputFormat calls {{HConnectionManager.deleteAllConnections(true)}}: it's the easiest answer to connection leaks.
I did hear someone complain about this call on IRC, though.


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020036#comment-13020036 ] 

Karthick Sankarachary commented on HBASE-3777:
----------------------------------------------

{quote}the clause (thatValue == null) can be omitted.{quote}

For some reason, I can't log into https://reviews.apache.org/, so for now I've updated the patch per your comment above here. 

{quote}Running the new test, I got: ...FAILURES...{quote}

My bad. When I ran that test, I was pointing to a real server, when I should've built upon the test framework. I rewrote the test case in terms of the {{TEST_UTIL}} framework, and that seems to work as well.


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025368#comment-13025368 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------



bq.  On 2011-04-25 20:05:54, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/client/HTable.java, line 259
bq.  > <https://reviews.apache.org/r/643/diff/3/?file=16912#file16912line259>
bq.  >
bq.  >     Yeah, this is ugly... it's almost as though you should have a special method for it, one that does not up the counters?
bq.  
bq.  Karthick Sankarachary wrote:
bq.      Just a thought - how about if we hide the ugliness in HCM, like so:
bq.      
bq.        public abstract class Connectable<T> {
bq.          public Configuration conf;
bq.      
bq.          public Connectable(Configuration conf) {
bq.            this.conf = conf;
bq.          }
bq.      
bq.          public abstract T connect(HConnection connection);
bq.        }
bq.      
bq.        public static <T> T execute(Connectable<T> connectable) {
bq.          if (connectable == null || connectable.conf == null) {
bq.            return null;
bq.          }
bq.          Configuration conf = connectable.conf;
bq.          HConnection connection = HConnectionManager.getConnection(conf);
bq.          try {
bq.            return connectable.connect(connection);
bq.          } finally {
bq.            HConnectionManager.deleteConnection(conf, false);
bq.          }
bq.        }
bq.      
bq.      That way, the HTable call would look somewhat prettier:
bq.      
bq.        HConnectionManager.execute(new Connectable<Boolean>(conf) {
bq.          public Boolean connect(HConnection connection) {
bq.            return connection.isTableEnabled(tableName);
bq.          }
bq.        });
bq.  
bq.  Karthick Sankarachary wrote:
bq.      BTW, if we bypass the reference counters in this situation, there's a chance, albeit small, that the connection might get closed by someone else while this guy is still trying to talk to it, which could result in a "connection is closed" type of error.

Your proposal is also ugly but I think less ugly than what we currently have so I would prefer it; it has the benefit of moving the ref counting back into HCM, not letting it out of the class (I'm fine w/ all your other comments Karthick)
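
For reference, the execute-around shape being discussed can be reduced to a self-contained sketch. Everything here is a hypothetical stand-in: {{ExecuteAround}}, {{Action}} and {{openRefs}} are invented names, a string plays the role of the connection, and a counter plays the role of the reference count kept inside HCM.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration of keeping acquire/release inside one helper. */
class ExecuteAround {
  /** Callback that receives the borrowed connection (a string stub here). */
  interface Action<T> {
    T run(String connection);
  }

  /** Stands in for the reference count that the connection manager would keep. */
  static final AtomicInteger openRefs = new AtomicInteger();

  static <T> T execute(Action<T> action) {
    openRefs.incrementAndGet();          // acquire: the caller never sees the counter
    try {
      return action.run("connection-stub");
    } finally {
      openRefs.decrementAndGet();        // release: runs even if the action throws
    }
  }
}
```

Because the release sits in a {{finally}} block, the count is restored whether the callback returns normally or throws, and callers cannot forget the matching release.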


- Michael


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/#review543
-----------------------------------------------------------


On 2011-04-22 21:16:59, Karthick Sankarachary wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/643/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-04-22 21:16:59)
bq.  
bq.  
bq.  Review request for hbase and Ted Yu.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?
bq.  
bq.  Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".
bq.  
bq.  
bq.  This addresses bug HBASE-3777.
bq.      https://issues.apache.org/jira/browse/HBASE-3777
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/HConstants.java 5701639 
bq.    src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java be31179 
bq.    src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java afb666a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java c348f7a 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTable.java edacf56 
bq.    src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
bq.    src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 9e3f4d1 
bq.    src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
bq.    src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java 79a48ba 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java d0a1e11 
bq.    src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java 78c3b42 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 5da5e34 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java b624d28 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 7f5b377 
bq.    src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java e25184e 
bq.    src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 60320a3 
bq.    src/test/java/org/apache/hadoop/hbase/client/TestHCM.java b01a2d2 
bq.    src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 
bq.    src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 
bq.  
bq.  Diff: https://reviews.apache.org/r/643/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  mvn test
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Karthick
bq.  
bq.




[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027844#comment-13027844 ] 

jiraposter@reviews.apache.org commented on HBASE-3777:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/643/
-----------------------------------------------------------

(Updated 2011-05-02 20:59:23.844076)


Review request for hbase and Ted Yu.


Changes
-------

I ran the tests today after rebasing the patch (yet again), and this time there were no failures, period.


Summary
-------

Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?

Here, I'd like to play devil's advocate and propose that we "deep-compare" HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is "concerned that a single HConnection is insufficient for sharing amongst clients", to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being "uniquely identifiable".


This addresses bug HBASE-3777.
    https://issues.apache.org/jira/browse/HBASE-3777


Diffs (updated)
-----

  src/main/java/org/apache/hadoop/hbase/HConstants.java 0911375 
  src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java feed777 
  src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java fa7448b 
  src/main/java/org/apache/hadoop/hbase/client/HConnection.java 1beedaf 
  src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java ded51c8 
  src/main/java/org/apache/hadoop/hbase/client/HTable.java 46bac9f 
  src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 
  src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 26d0b31 
  src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 
  src/main/java/org/apache/hadoop/hbase/mapred/TableOutputFormat.java 80284bb 
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java 05d9395 
  src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa 
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java f526411 
  src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 834c456 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 421f275 
  src/main/java/org/apache/hadoop/hbase/replication/master/ReplicationLogCleaner.java 133da33 
  src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java fc71f03 
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 1c1d94b 
  src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java 39f3af2 
  src/main/java/org/apache/hadoop/hbase/util/HMerge.java c447287 
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java a4def94 
  src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 
  src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java 75613b8 
  src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java ae333bb 
  src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 18e647e 
  src/test/java/org/apache/hadoop/hbase/client/TestHCM.java 5d71d75 
  src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 
  src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java 752e12b 
  src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb 

Diff: https://reviews.apache.org/r/643/diff


Testing
-------

mvn test


Thanks,

Karthick




[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020716#comment-13020716 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

hbase.zookeeper.property.clientPort should also be one of the connection-specific properties.


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020859#comment-13020859 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

I see TestHFileOutputFormat time out:
{code} 
"main" prio=5 tid=101801000 nid=0x100601000 waiting on condition [1005ff000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.hadoop.hbase.client.HBaseAdmin.disableTable(HBaseAdmin.java:543)
        at org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat.doIncrementalLoadTest(TestHFileOutputFormat.java:352)
        at org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat.testMRIncrementalLoadWithSplit(TestHFileOutputFormat.java:163)
{code}

Another test failure was:
{code}
-------------------------------------------------------------------------------
Test set: org.apache.hadoop.hbase.replication.TestReplication
-------------------------------------------------------------------------------
Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 110.1 sec <<< FAILURE!
testVerifyRepJob(org.apache.hadoop.hbase.replication.TestReplication)  Time elapsed: 8.15 sec  <<< FAILURE!
java.lang.AssertionError: expected:<0> but was:<100>
        at org.junit.Assert.fail(Assert.java:91)
        at org.junit.Assert.failNotEquals(Assert.java:645)
        at org.junit.Assert.assertEquals(Assert.java:126)
        at org.junit.Assert.assertEquals(Assert.java:470)
        at org.junit.Assert.assertEquals(Assert.java:454)
        at org.apache.hadoop.hbase.replication.TestReplication.testVerifyRepJob(TestReplication.java:510)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
{code}


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019597#comment-13019597 ] 

Karthick Sankarachary commented on HBASE-3777:
----------------------------------------------

That sounds like a plan. Are there any threads that talk about the error cases we run into in 0.89?


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019574#comment-13019574 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

For a), I like the idea of adding a uniquifier to HBaseConfiguration. This can be standardized through a well-known configuration parameter, such as "hbase.zookeeper.uniquifier" (a secondary key, really).

For b), I don't have a strong opinion about any particular implementation. What I have yet to propose is that we could implement an (optional) timeout mechanism for connections, to address the issue raised in the thread "hbase -0.90.x upgrade - zookeeper exception in mapreduce job" on the user mailing list.
It may be easier to enforce a timeout policy in HCM, hence the centralized reference counting.
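
To make the two ideas above concrete, here is a minimal sketch of a registry that keys connections on a uniquifier and counts references centrally. It is an illustration under stated assumptions, not the patch's code: RefCountedConnections, keyOf, acquire and release are hypothetical names, a plain java.util.Map stands in for the Hadoop Configuration, and "hbase.zookeeper.uniquifier" is only the parameter name suggested above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry standing in for HConnectionManager (HCM); not the patch's code. */
final class RefCountedConnections {
    /** A placeholder connection plus the number of clients currently holding it. */
    private static final class Entry {
        final Object connection = new Object();
        int refCount;
    }

    private final Map<String, Entry> instances = new HashMap<>();

    /** Builds the lookup key from the quorum plus the optional uniquifier (assumed names). */
    static String keyOf(Map<String, String> conf) {
        return conf.getOrDefault("hbase.zookeeper.quorum", "")
            + "|" + conf.getOrDefault("hbase.zookeeper.uniquifier", "");
    }

    /** Returns the shared connection for this key, creating it on first use. */
    synchronized Object acquire(Map<String, String> conf) {
        Entry entry = instances.computeIfAbsent(keyOf(conf), k -> new Entry());
        entry.refCount++;
        return entry.connection;
    }

    /** Returns true when the last reference was dropped and the entry was evicted. */
    synchronized boolean release(Map<String, String> conf) {
        String key = keyOf(conf);
        Entry entry = instances.get(key);
        if (entry == null) {
            return false;
        }
        if (--entry.refCount > 0) {
            return false;
        }
        instances.remove(key);
        return true;
    }
}

class RefCountedConnectionsDemo {
    public static void main(String[] args) {
        RefCountedConnections hcm = new RefCountedConnections();
        Map<String, String> shared = new HashMap<>();
        shared.put("hbase.zookeeper.quorum", "zk1");
        Map<String, String> isolated = new HashMap<>(shared);
        isolated.put("hbase.zookeeper.uniquifier", "job-42");

        Object first = hcm.acquire(shared);
        Object copy = hcm.acquire(new HashMap<>(shared)); // equal properties, same connection
        Object own = hcm.acquire(isolated);                // uniquifier forces a separate one
        System.out.println(first == copy);
        System.out.println(first == own);
    }
}
```

Because the count lives in one place, a timeout policy could in principle be hooked into release(), evicting idle entries instead of waiting for every client to let go.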


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020918#comment-13020918 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

I wasn't sure about this claim about finalizers in Java 1.6 and beyond (http://forums.whirlpool.net.au/archive/754353):
"In fact, it is perfectly permissible for a Java VM to *never* call it."

y.s.ramakrishna@oracle.com answered:

Yes; indeed, the spec is deliberately loose because it
is difficult in practice to implement any hard promptness
guarantees in general.
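
Since finalization carries no promptness guarantee, cleanup that must happen is usually done through an explicit close. The sketch below only illustrates that general pattern; TrackedConnection is a hypothetical class, not part of HBase or of the patch.

```java
/** Hypothetical connection that records whether it was closed explicitly. */
final class TrackedConnection implements AutoCloseable {
    private boolean closed;

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        // Release resources here rather than in finalize(), which may never run.
        closed = true;
    }
}

class ExplicitCloseDemo {
    public static void main(String[] args) {
        TrackedConnection seen;
        // try-with-resources calls close() deterministically when the block exits.
        try (TrackedConnection connection = new TrackedConnection()) {
            seen = connection;
        }
        System.out.println(seen.isClosed());
    }
}
```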


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020124#comment-13020124 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

I think the reason outlined above makes the case for at least duplicating the Configuration in the HConnectionKey ctor.
Suppose client A comes with a config and places an entry in HBASE_INSTANCES. Then client B comes with an equivalent config, and the previous entry is returned to client B.
Now client A modifies one of the connection-specific properties, which also changes the HConnectionKey that client B relies on.

Since we expect the connections to be reused a lot (evidenced by TestHCM#testManyNewConnectionsDoesnotOOME), the cost of duplicating/making Configuration immutable is low.
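
The scenario above can be sketched as follows. This is a hedged illustration, not the patch: SnapshotConnectionKey is a hypothetical stand-in for HConnectionKey, a plain java.util.Map stands in for the Hadoop Configuration, and the property list is an assumption (only hbase.zookeeper.property.clientPort is named in this thread). The key copies the connection-specific values in its constructor, so a later change by client A cannot alter an entry already placed in the map.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for HConnectionKey; copies values instead of holding the config. */
final class SnapshotConnectionKey {
    // Assumed set of connection-specific properties, for illustration only.
    static final List<String> CONNECTION_PROPERTIES = Arrays.asList(
        "hbase.zookeeper.quorum",
        "hbase.zookeeper.property.clientPort");

    private final Map<String, String> snapshot = new TreeMap<>();

    SnapshotConnectionKey(Map<String, String> conf) {
        // Copy the values now, so later mutation of conf cannot change this key.
        for (String name : CONNECTION_PROPERTIES) {
            String value = conf.get(name);
            if (value != null) {
                snapshot.put(name, value);
            }
        }
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof SnapshotConnectionKey
            && snapshot.equals(((SnapshotConnectionKey) other).snapshot);
    }

    @Override
    public int hashCode() {
        return snapshot.hashCode();
    }
}

class SnapshotConnectionKeyDemo {
    public static void main(String[] args) {
        Map<String, String> confA = new HashMap<>();
        confA.put("hbase.zookeeper.quorum", "zk1");
        confA.put("hbase.zookeeper.property.clientPort", "2181");
        Map<String, String> confB = new HashMap<>(confA);

        Map<SnapshotConnectionKey, String> instances = new HashMap<>();
        instances.put(new SnapshotConnectionKey(confA), "connection-1");

        // Client A changes its config after the entry was registered.
        confA.put("hbase.zookeeper.property.clientPort", "2182");

        // Client B still finds the original entry because the key was a snapshot.
        System.out.println(instances.get(new SnapshotConnectionKey(confB)));
    }
}
```

Had the key held a reference to client A's config and hashed it lazily, the entry would sit in a stale hash bucket after the change and lookups could miss it.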


[jira] [Updated] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Karthick Sankarachary (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthick Sankarachary updated HBASE-3777:
-----------------------------------------

    Attachment: HBASE-3777.patch


[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration

Posted by "Ted Yu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13021905#comment-13021905 ] 

Ted Yu commented on HBASE-3777:
-------------------------------

There was one little conflict in HConnection.java where J-D recently put in:
{code}
  public int getCurrentNrHRS() throws IOException;
{code}
I will run tests on Linux.
