You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by "mutianf (via GitHub)" <gi...@apache.org> on 2023/08/23 15:37:29 UTC

[GitHub] [beam] mutianf opened a new pull request, #28122: fix: fix a race condition in BigtableService cache

mutianf opened a new pull request, #28122:
URL: https://github.com/apache/beam/pull/28122

   **Please** add a meaningful description for your change here
   
   ------------------------
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
    - [ ] Mention the appropriate issue in your description (for example: `addresses #123`), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment `fixes #<ISSUE NUMBER>` instead.
    - [ ] Update `CHANGES.md` with noteworthy changes.
    - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://github.com/apache/beam/blob/master/CONTRIBUTING.md#make-the-reviewers-job-easier).
   
   To check the build health, please visit [https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md](https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md)
   
   GitHub Actions Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   [![Build python source distribution and wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Go tests](https://github.com/apache/beam/workflows/Go%20tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Go+tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more information about GitHub Actions CI or the [workflows README](https://github.com/apache/beam/blob/master/.github/workflows/README.md) to see a list of phrases to trigger workflows.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] Abacn commented on a diff in pull request #28122: fix: fix a race condition in BigtableService cache

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on code in PR #28122:
URL: https://github.com/apache/beam/pull/28122#discussion_r1303307732


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableServiceFactory.java:
##########
@@ -105,8 +106,12 @@ BigtableServiceEntry getServiceForReading(
       BigtableServiceEntry entry = entries.get(configId.id());
       if (entry != null) {
         // When entry is not null, refCount.get(configId.id()) should always exist.
-        // Do a getOrDefault to avoid unexpected NPEs.
-        refCounts.getOrDefault(configId.id(), new AtomicInteger(0)).getAndIncrement();
+        Preconditions.checkNotNull(
+            refCounts.get(configId.id()),

Review Comment:
   Though here is in synchronized block, assign a local variable to checkNotNull(refCounts.get(configId.id())) then call getAndIncrement on. this varaible. This is the canonical use of checkNotNull



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] bvolpato commented on a diff in pull request #28122: fix: fix a race condition in BigtableService cache

Posted by "bvolpato (via GitHub)" <gi...@apache.org>.
bvolpato commented on code in PR #28122:
URL: https://github.com/apache/beam/pull/28122#discussion_r1308904061


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableServiceFactory.java:
##########
@@ -105,8 +104,12 @@ BigtableServiceEntry getServiceForReading(
       BigtableServiceEntry entry = entries.get(configId.id());
       if (entry != null) {
         // When entry is not null, refCount.get(configId.id()) should always exist.
-        // Do a getOrDefault to avoid unexpected NPEs.
-        refCounts.getOrDefault(configId.id(), new AtomicInteger(0)).getAndIncrement();
+        // Doing a putIfAbsent to avoid NPE.
+        AtomicInteger count = refCounts.putIfAbsent(configId.id(), new AtomicInteger(0));
+        if (count == null) {
+          LOG.error("entry is not null but refCount of config Id " + configId.id() + " is null.");
+        }

Review Comment:
   Scratch that, putIfAbsent returns the old object, not the new (thanks for pointing out).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] Abacn commented on a diff in pull request #28122: fix: fix a race condition in BigtableService cache

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on code in PR #28122:
URL: https://github.com/apache/beam/pull/28122#discussion_r1304869938


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableServiceFactory.java:
##########
@@ -105,8 +104,12 @@ BigtableServiceEntry getServiceForReading(
       BigtableServiceEntry entry = entries.get(configId.id());
       if (entry != null) {
         // When entry is not null, refCount.get(configId.id()) should always exist.
-        // Do a getOrDefault to avoid unexpected NPEs.
-        refCounts.getOrDefault(configId.id(), new AtomicInteger(0)).getAndIncrement();
+        // Doing a putIfAbsent to avoid NPE.
+        AtomicInteger count = refCounts.putIfAbsent(configId.id(), new AtomicInteger(0));

Review Comment:
   this will definitely fix symptom. Though I cannot see if it could cause any potential connection leak. Let's see if @bvolpato 's opinion.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] Abacn commented on a diff in pull request #28122: fix: fix a race condition in BigtableService cache

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on code in PR #28122:
URL: https://github.com/apache/beam/pull/28122#discussion_r1303309112


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableServiceFactory.java:
##########
@@ -105,8 +106,12 @@ BigtableServiceEntry getServiceForReading(
       BigtableServiceEntry entry = entries.get(configId.id());
       if (entry != null) {
         // When entry is not null, refCount.get(configId.id()) should always exist.
-        // Do a getOrDefault to avoid unexpected NPEs.
-        refCounts.getOrDefault(configId.id(), new AtomicInteger(0)).getAndIncrement();
+        Preconditions.checkNotNull(
+            refCounts.get(configId.id()),

Review Comment:
   I see nullness error is suppressed in this file. Since NPE caused issue, would be nice to clean up nullness errorprone in this file and remove suppress(nullness) in the class decorator



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] bvolpato commented on pull request #28122: fix: fix a race condition in BigtableService cache

Posted by "bvolpato (via GitHub)" <gi...@apache.org>.
bvolpato commented on PR #28122:
URL: https://github.com/apache/beam/pull/28122#issuecomment-1697553320

   @Abacn thanks for the double check, this should be good to go


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] Abacn commented on a diff in pull request #28122: fix: fix a race condition in BigtableService cache

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn commented on code in PR #28122:
URL: https://github.com/apache/beam/pull/28122#discussion_r1303305418


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableServiceFactory.java:
##########
@@ -75,15 +76,15 @@ static BigtableServiceEntry create(ConfigId configId, BigtableService service) {
 
     @Override
     public void close() {
-      int refCount =
-          refCounts.getOrDefault(getConfigId().id(), new AtomicInteger(0)).decrementAndGet();
-      if (refCount < 0) {
-        LOG.error(
-            "close() Ref count is < 0, configId=" + getConfigId().id() + " refCount=" + refCount);
-      }
-      LOG.debug("close() for config id " + getConfigId().id() + ", ref count is " + refCount);
-      if (refCount == 0) {
-        synchronized (lock) {
+      synchronized (lock) {
+        int refCount =
+            refCounts.getOrDefault(getConfigId().id(), new AtomicInteger(0)).decrementAndGet();
+        if (refCount < 0) {
+          LOG.error(
+              "close() Ref count is < 0, configId=" + getConfigId().id() + " refCount=" + refCount);
+        }
+        LOG.debug("close() for config id " + getConfigId().id() + ", ref count is " + refCount);
+        if (refCount == 0) {

Review Comment:
   concurrent programming can be tricky. It makes sense to me that move the lock here could resolve the race condition. However, is there any harm if make the entire close() function synchronized?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] bvolpato commented on a diff in pull request #28122: fix: fix a race condition in BigtableService cache

Posted by "bvolpato (via GitHub)" <gi...@apache.org>.
bvolpato commented on code in PR #28122:
URL: https://github.com/apache/beam/pull/28122#discussion_r1303312575


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableServiceFactory.java:
##########
@@ -75,15 +76,15 @@ static BigtableServiceEntry create(ConfigId configId, BigtableService service) {
 
     @Override
     public void close() {
-      int refCount =
-          refCounts.getOrDefault(getConfigId().id(), new AtomicInteger(0)).decrementAndGet();
-      if (refCount < 0) {
-        LOG.error(
-            "close() Ref count is < 0, configId=" + getConfigId().id() + " refCount=" + refCount);
-      }
-      LOG.debug("close() for config id " + getConfigId().id() + ", ref count is " + refCount);
-      if (refCount == 0) {
-        synchronized (lock) {
+      synchronized (lock) {
+        int refCount =
+            refCounts.getOrDefault(getConfigId().id(), new AtomicInteger(0)).decrementAndGet();
+        if (refCount < 0) {
+          LOG.error(
+              "close() Ref count is < 0, configId=" + getConfigId().id() + " refCount=" + refCount);
+        }
+        LOG.debug("close() for config id " + getConfigId().id() + ", ref count is " + refCount);
+        if (refCount == 0) {

Review Comment:
    decrementAndCount is thread safe, so I wonder what's the actual underlying issue here.
   
   I also understand why `< 0` is logging an error, but shouldn't we trigger `.remove()` in that case too? Instead of doing `== 0` in the next if



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] bvolpato commented on a diff in pull request #28122: fix: fix a race condition in BigtableService cache

Posted by "bvolpato (via GitHub)" <gi...@apache.org>.
bvolpato commented on code in PR #28122:
URL: https://github.com/apache/beam/pull/28122#discussion_r1306601260


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableServiceFactory.java:
##########
@@ -105,8 +104,12 @@ BigtableServiceEntry getServiceForReading(
       BigtableServiceEntry entry = entries.get(configId.id());
       if (entry != null) {
         // When entry is not null, refCount.get(configId.id()) should always exist.
-        // Do a getOrDefault to avoid unexpected NPEs.
-        refCounts.getOrDefault(configId.id(), new AtomicInteger(0)).getAndIncrement();
+        // Doing a putIfAbsent to avoid NPE.
+        AtomicInteger count = refCounts.putIfAbsent(configId.id(), new AtomicInteger(0));
+        if (count == null) {
+          LOG.error("entry is not null but refCount of config Id " + configId.id() + " is null.");
+        }

Review Comment:
   I believe you've experienced the racing and there was a reason to add it, though, so this should be fine -- but it appears that this branch is spurious / never reachable.
   
   It seems that what the previous code could simply be fixed by using putIfAbsent?
   
   ```
   refCounts.putIfAbsent(configId.id(), new AtomicInteger(0)).getAndIncrement();
   ```
   
   You may not even need `refCounts.put(configId.id(), new AtomicInteger(1));` which is done later, so you have only 1 AtomicInteger per config id
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] bvolpato commented on a diff in pull request #28122: fix: fix a race condition in BigtableService cache

Posted by "bvolpato (via GitHub)" <gi...@apache.org>.
bvolpato commented on code in PR #28122:
URL: https://github.com/apache/beam/pull/28122#discussion_r1308904061


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableServiceFactory.java:
##########
@@ -105,8 +104,12 @@ BigtableServiceEntry getServiceForReading(
       BigtableServiceEntry entry = entries.get(configId.id());
       if (entry != null) {
         // When entry is not null, refCount.get(configId.id()) should always exist.
-        // Do a getOrDefault to avoid unexpected NPEs.
-        refCounts.getOrDefault(configId.id(), new AtomicInteger(0)).getAndIncrement();
+        // Doing a putIfAbsent to avoid NPE.
+        AtomicInteger count = refCounts.putIfAbsent(configId.id(), new AtomicInteger(0));
+        if (count == null) {
+          LOG.error("entry is not null but refCount of config Id " + configId.id() + " is null.");
+        }

Review Comment:
   Scratch that, putIfAbsent returns the old object, not the new.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] bvolpato commented on a diff in pull request #28122: fix: fix a race condition in BigtableService cache

Posted by "bvolpato (via GitHub)" <gi...@apache.org>.
bvolpato commented on code in PR #28122:
URL: https://github.com/apache/beam/pull/28122#discussion_r1303312575


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableServiceFactory.java:
##########
@@ -75,15 +76,15 @@ static BigtableServiceEntry create(ConfigId configId, BigtableService service) {
 
     @Override
     public void close() {
-      int refCount =
-          refCounts.getOrDefault(getConfigId().id(), new AtomicInteger(0)).decrementAndGet();
-      if (refCount < 0) {
-        LOG.error(
-            "close() Ref count is < 0, configId=" + getConfigId().id() + " refCount=" + refCount);
-      }
-      LOG.debug("close() for config id " + getConfigId().id() + ", ref count is " + refCount);
-      if (refCount == 0) {
-        synchronized (lock) {
+      synchronized (lock) {
+        int refCount =
+            refCounts.getOrDefault(getConfigId().id(), new AtomicInteger(0)).decrementAndGet();
+        if (refCount < 0) {
+          LOG.error(
+              "close() Ref count is < 0, configId=" + getConfigId().id() + " refCount=" + refCount);
+        }
+        LOG.debug("close() for config id " + getConfigId().id() + ", ref count is " + refCount);
+        if (refCount == 0) {

Review Comment:
    decrementAndCount is thread safe, so I wonder what's the actual underlying issue here.
   
   I also understand why < 0 is logging an error, but shouldn't we `.remove()` in that case too? Instead of doing `== 0` in the next if



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] github-actions[bot] commented on pull request #28122: fix: fix a race condition in BigtableService cache

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #28122:
URL: https://github.com/apache/beam/pull/28122#issuecomment-1690282344

   Assigning reviewers. If you would like to opt out of this review, comment `assign to next reviewer`:
   
   R: @bvolpato for label java.
   R: @johnjcasey for label io.
   
   Available commands:
   - `stop reviewer notifications` - opt out of the automated review tooling
   - `remind me after tests pass` - tag the comment author after tests pass
   - `waiting on author` - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)
   
   The PR bot will only process comments in the main thread (not review comments).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] Abacn merged pull request #28122: fix: fix a race condition in BigtableService cache

Posted by "Abacn (via GitHub)" <gi...@apache.org>.
Abacn merged PR #28122:
URL: https://github.com/apache/beam/pull/28122


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] mutianf commented on a diff in pull request #28122: fix: fix a race condition in BigtableService cache

Posted by "mutianf (via GitHub)" <gi...@apache.org>.
mutianf commented on code in PR #28122:
URL: https://github.com/apache/beam/pull/28122#discussion_r1304671705


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableServiceFactory.java:
##########
@@ -75,15 +76,15 @@ static BigtableServiceEntry create(ConfigId configId, BigtableService service) {
 
     @Override
     public void close() {
-      int refCount =
-          refCounts.getOrDefault(getConfigId().id(), new AtomicInteger(0)).decrementAndGet();
-      if (refCount < 0) {
-        LOG.error(
-            "close() Ref count is < 0, configId=" + getConfigId().id() + " refCount=" + refCount);
-      }
-      LOG.debug("close() for config id " + getConfigId().id() + ", ref count is " + refCount);
-      if (refCount == 0) {
-        synchronized (lock) {
+      synchronized (lock) {
+        int refCount =
+            refCounts.getOrDefault(getConfigId().id(), new AtomicInteger(0)).decrementAndGet();
+        if (refCount < 0) {
+          LOG.error(
+              "close() Ref count is < 0, configId=" + getConfigId().id() + " refCount=" + refCount);
+        }
+        LOG.debug("close() for config id " + getConfigId().id() + ", ref count is " + refCount);
+        if (refCount == 0) {

Review Comment:
   I think it's fine, originally the part that's outside of the lock was also just getting the ref count.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] mutianf commented on a diff in pull request #28122: fix: fix a race condition in BigtableService cache

Posted by "mutianf (via GitHub)" <gi...@apache.org>.
mutianf commented on code in PR #28122:
URL: https://github.com/apache/beam/pull/28122#discussion_r1304672738


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableServiceFactory.java:
##########
@@ -105,8 +106,12 @@ BigtableServiceEntry getServiceForReading(
       BigtableServiceEntry entry = entries.get(configId.id());
       if (entry != null) {
         // When entry is not null, refCount.get(configId.id()) should always exist.
-        // Do a getOrDefault to avoid unexpected NPEs.
-        refCounts.getOrDefault(configId.id(), new AtomicInteger(0)).getAndIncrement();
+        Preconditions.checkNotNull(
+            refCounts.get(configId.id()),

Review Comment:
   I don't thnk i'm able to remove the suppress(nullness) annotation on the class. If I removed it doing:
   
   ```
         BigtableServiceEntry entry = entries.get(configId.id());
         if (entry != null) {
   ```
   
   Won't build



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] mutianf commented on a diff in pull request #28122: fix: fix a race condition in BigtableService cache

Posted by "mutianf (via GitHub)" <gi...@apache.org>.
mutianf commented on code in PR #28122:
URL: https://github.com/apache/beam/pull/28122#discussion_r1303444386


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableServiceFactory.java:
##########
@@ -75,15 +76,15 @@ static BigtableServiceEntry create(ConfigId configId, BigtableService service) {
 
     @Override
     public void close() {
-      int refCount =
-          refCounts.getOrDefault(getConfigId().id(), new AtomicInteger(0)).decrementAndGet();
-      if (refCount < 0) {
-        LOG.error(
-            "close() Ref count is < 0, configId=" + getConfigId().id() + " refCount=" + refCount);
-      }
-      LOG.debug("close() for config id " + getConfigId().id() + ", ref count is " + refCount);
-      if (refCount == 0) {
-        synchronized (lock) {
+      synchronized (lock) {
+        int refCount =
+            refCounts.getOrDefault(getConfigId().id(), new AtomicInteger(0)).decrementAndGet();
+        if (refCount < 0) {
+          LOG.error(
+              "close() Ref count is < 0, configId=" + getConfigId().id() + " refCount=" + refCount);
+        }
+        LOG.debug("close() for config id " + getConfigId().id() + ", ref count is " + refCount);
+        if (refCount == 0) {

Review Comment:
   I think the underlying issue is the race between:
    - close() decrementing the counter and
    - getServiceForReading() incrementing the counter
   Since decrementing the counter is outside of the lock, and we're getting the refCount number and do something with it, there could be a race. 
   
   When `<0`, that means refCounts doesn't has a reference to configId, so we're not triggering the remove there. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] bvolpato commented on a diff in pull request #28122: fix: fix a race condition in BigtableService cache

Posted by "bvolpato (via GitHub)" <gi...@apache.org>.
bvolpato commented on code in PR #28122:
URL: https://github.com/apache/beam/pull/28122#discussion_r1306601326


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableServiceFactory.java:
##########
@@ -150,8 +153,12 @@ BigtableServiceEntry getServiceForWriting(
       LOG.debug("getServiceForWriting(), config id: " + configId.id());
       if (entry != null) {
         // When entry is not null, refCount.get(configId.id()) should always exist.
-        // Do a getOrDefault to avoid unexpected NPEs.
-        refCounts.getOrDefault(configId.id(), new AtomicInteger(0)).getAndIncrement();
+        // Doing a putIfAbsent to avoid NPE.
+        AtomicInteger count = refCounts.putIfAbsent(configId.id(), new AtomicInteger(0));

Review Comment:
   Same here



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] bvolpato commented on a diff in pull request #28122: fix: fix a race condition in BigtableService cache

Posted by "bvolpato (via GitHub)" <gi...@apache.org>.
bvolpato commented on code in PR #28122:
URL: https://github.com/apache/beam/pull/28122#discussion_r1306601260


##########
sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableServiceFactory.java:
##########
@@ -105,8 +104,12 @@ BigtableServiceEntry getServiceForReading(
       BigtableServiceEntry entry = entries.get(configId.id());
       if (entry != null) {
         // When entry is not null, refCount.get(configId.id()) should always exist.
-        // Do a getOrDefault to avoid unexpected NPEs.
-        refCounts.getOrDefault(configId.id(), new AtomicInteger(0)).getAndIncrement();
+        // Doing a putIfAbsent to avoid NPE.
+        AtomicInteger count = refCounts.putIfAbsent(configId.id(), new AtomicInteger(0));
+        if (count == null) {
+          LOG.error("entry is not null but refCount of config Id " + configId.id() + " is null.");
+        }

Review Comment:
   I believe you've experienced the racing and there was a reason to add it, though, so this should be fine -- but it appears that this branch is spurious / never reachable.
   
   It appears that what the previous code could simply be fixed by using putIfAbsent?
   
   ```
   refCounts.putIfAbsent(configId.id(), new AtomicInteger(0)).getAndIncrement();
   ```
   
   You may not even need `refCounts.put(configId.id(), new AtomicInteger(1));` which is done later, so you have only 1 AtomicInteger per config id
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org