You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@phoenix.apache.org by GitBox <gi...@apache.org> on 2020/08/17 21:11:37 UTC

[GitHub] [phoenix] swaroopak opened a new pull request #860: PHOENIX-6080: Add a check to Index Rebuild jobs to check region closi…

swaroopak opened a new pull request #860:
URL: https://github.com/apache/phoenix/pull/860


   …ng before every inner batch


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] kadirozde commented on pull request #860: PHOENIX-6080: Add a check to Index Rebuild jobs to check region closi…

Posted by GitBox <gi...@apache.org>.
kadirozde commented on pull request #860:
URL: https://github.com/apache/phoenix/pull/860#issuecomment-677861946


   @gjacoby126, I am not sure if your suggestion improves something. Why should not IndexRebuidRegionScanner  be concerned or know about that a data table is splitting or closing? It directly works on regions, it knows region boundaries, it knows that it is running within a coproc, etc. Actually this bug is introduced because we removed the references to UARO from the IndexRebuidRegionScanner and ignored region split or close operations.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] gjacoby126 commented on a change in pull request #860: PHOENIX-6080: Add a check to Index Rebuild jobs to check region closi…

Posted by GitBox <gi...@apache.org>.
gjacoby126 commented on a change in pull request #860:
URL: https://github.com/apache/phoenix/pull/860#discussion_r471829610



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -1301,11 +1301,24 @@ public boolean next(List<Cell> results) throws IOException {
         try {
             localScanner = getLocalScanner();
             if (localScanner == null) {
+                region.closeRegionOperation();
                 return false;
             }
             synchronized (localScanner) {
                 if (!shouldVerify()) {
                     skipped = true;
+                    region.closeRegionOperation();

Review comment:
       @swaroopak - do we need the region.closeRegionOperation() and localScanner.close() calls? They appear to already be called in the finally block, which will execute even if we return false early. And a finally block seems like a safer place for them so some other code path doesn't miss them in the future. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] swaroopak merged pull request #860: PHOENIX-6080: Add a check to Index Rebuild jobs to check region closi…

Posted by GitBox <gi...@apache.org>.
swaroopak merged pull request #860:
URL: https://github.com/apache/phoenix/pull/860


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] kadirozde commented on pull request #860: PHOENIX-6080: Add a check to Index Rebuild jobs to check region closi…

Posted by GitBox <gi...@apache.org>.
kadirozde commented on pull request #860:
URL: https://github.com/apache/phoenix/pull/860#issuecomment-677915427


   I do not see any relation between Single Responsibility Principle and what we are talking about. What we are talking about how IndexRebuildRegionScanner can use the functionality provided by UARO. There is no responsibility sharing in terms of detecting region closing and splits and actions to be taken after. In this PR, IndexRebuildRegionScanner scanner knows about UARO and uses the functionality UARO implements. You suggest that IndexRebuildRegionScanner should not know about UARO,  should not know about region close and splits, and should be instructed to stop its operation. However, it should somehow know magically what to do when it stops its operation. Sorry, that does not make sense to me. Making this implicit for the sake of decoupling and abstracting does not work in this case and there is no problem with the current client  /user (IndexRebuildRegionScanner) and server/provider (UARO) model used here as far as I see.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] gjacoby126 commented on pull request #860: PHOENIX-6080: Add a check to Index Rebuild jobs to check region closi…

Posted by GitBox <gi...@apache.org>.
gjacoby126 commented on pull request #860:
URL: https://github.com/apache/phoenix/pull/860#issuecomment-677889679


   @kadirozde - Single Responsibility Principle. With this PR, the responsibility for handling closes and splits is spread between both UARO and the rebuild scanner. If someday we refactor UARO so that upsert selects and batch deletes have their own classes (which we really ought to do eventually), then we'd have to have special logic for splits / closes in 4 places. 
   
   In general, this class design of having both UARO having a pointer to the scanner and the scanner having a pointer to UARO is a code smell; it indicates the classes are too tightly coupled. It's better if the scanner doesn't know about UARO, or splits or other region operations, but just has a notion that it might be canceled by an external process. This puts the UARO in the position of a task manager, and the scanner as a task worker, with a clean separation between them. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] gjacoby126 commented on pull request #860: PHOENIX-6080: Add a check to Index Rebuild jobs to check region closi…

Posted by GitBox <gi...@apache.org>.
gjacoby126 commented on pull request #860:
URL: https://github.com/apache/phoenix/pull/860#issuecomment-677969630


   Simplifying the testing is a bonus, not the reason. Code classes should be highly cohesive and loosely coupled. Standard OO principles. :-) This change makes the rebuild scanner less cohesive and more tightly coupled to UARO. 
   
   I'm not really sure why (isClosed()) { throw new IOException("Some informative message") }, where isClosed() checks some AtomicBoolean local to the scanner, would be harder to read than a call to UARO.checkRegionClosingOrSplitting. It does add some bookkeeping to UARO, but it already has a spot for that in the waitForScanners method. 
   
   That said, I will not veto this patch as it is if you approve it, @kadirozde .


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] gjacoby126 commented on pull request #860: PHOENIX-6080: Add a check to Index Rebuild jobs to check region closi…

Posted by GitBox <gi...@apache.org>.
gjacoby126 commented on pull request #860:
URL: https://github.com/apache/phoenix/pull/860#issuecomment-677926205


   IndexRebuildRegionScanner doesn't _need_ to know that it's in a bad state because the region is splitting or closing. It just needs to know that it's in a bad state, and to throw an exception. It's a scanner -- at a high level, it should understand open, close, and next(). Scanners and iterators don't usually close _themselves_, some external class does. 
   
   SRP says that each class should have one reason for existing, and one reason to change. Say HBase 3 adds a new event we'd want to stop the scanner for that's not covered by UARO.checkForRegionClosingOrSplitting(). You'd have to modify IndexRebuildRegionScanner in a bunch of places to call UARO.checkForNewEventState(). On the other hand, if sending a close or kill signal to the scanner is UARO's responsibility, then adding a new region event just changes UARO, and the scanner doesn't change. 
   
   It's a cleaner separation of concerns. (And also makes this easier to unit test)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] kadirozde commented on pull request #860: PHOENIX-6080: Add a check to Index Rebuild jobs to check region closi…

Posted by GitBox <gi...@apache.org>.
kadirozde commented on pull request #860:
URL: https://github.com/apache/phoenix/pull/860#issuecomment-677946799


   @gjacoby126, Now you are asking to complicate the current implementation for the sake of simplifying some minor testing. I still see that it will make things implicit and make the code harder to read. If you really want to still push this, then work with @swaroopak to implement it. I will review the change after that.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] swaroopak commented on pull request #860: PHOENIX-6080: Add a check to Index Rebuild jobs to check region closi…

Posted by GitBox <gi...@apache.org>.
swaroopak commented on pull request #860:
URL: https://github.com/apache/phoenix/pull/860#issuecomment-678368646


   Thank you for the comments and review @gjacoby126 and @kadirozde . The pre-commit build is disabled https://builds.apache.org/job/PreCommit-PHOENIX-Build/ So the tests are running locally on the patch. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] gjacoby126 commented on pull request #860: PHOENIX-6080: Add a check to Index Rebuild jobs to check region closi…

Posted by GitBox <gi...@apache.org>.
gjacoby126 commented on pull request #860:
URL: https://github.com/apache/phoenix/pull/860#issuecomment-677890317


   The current design here also means that when reasoning about deadlocks and concurrency, you have to look at code in multiple classes, because the IndexRebuildRegionScanner has to grab a lock in UARO. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] kadirozde commented on a change in pull request #860: PHOENIX-6080: Add a check to Index Rebuild jobs to check region closi…

Posted by GitBox <gi...@apache.org>.
kadirozde commented on a change in pull request #860:
URL: https://github.com/apache/phoenix/pull/860#discussion_r471838380



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -1301,11 +1301,24 @@ public boolean next(List<Cell> results) throws IOException {
         try {
             localScanner = getLocalScanner();
             if (localScanner == null) {
+                region.closeRegionOperation();
                 return false;
             }
             synchronized (localScanner) {
                 if (!shouldVerify()) {
                     skipped = true;
+                    region.closeRegionOperation();

Review comment:
       As I stated before, the fix is to check the isRegionClosingOrSplitting attribute of the UngroupedAggregateRegionObserver coproc and return an IOException after every scanner next or table batch operation done during index rebuild or verification if  the region is closing or splitting.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] gjacoby126 commented on pull request #860: PHOENIX-6080: Add a check to Index Rebuild jobs to check region closi…

Posted by GitBox <gi...@apache.org>.
gjacoby126 commented on pull request #860:
URL: https://github.com/apache/phoenix/pull/860#issuecomment-677833283


   @kadirozde @swaroopak - see the reference counting mechanism in UngroupedAggregateRegionObserver.waitForScansToFinish, which is called by preClose and preSplit. (Also note that it takes a lock and holds it while waiting for scans to finish, and that the isRegionClosingOrSplitting flag is always accessed using that lock, so the calls to UARO.checkForRegionClosing() might be waiting on it.)
   
   In the current PR, the split / close logic is divided between UARO and the IndexRebuild scanner, and the latter ideally shouldn't have to worry about that. 
   
   Wouldn't it be cleaner if, rather than passing the UARO to the IndexRebuild scanner, the UARO instead kept a reference to the rebuild scanner and sent it a kill signal in waitForScansToFinish()? Then all the places in this PR that are calling UARO.checkForRegionClosing() could just instead check if their kill flag was flipped, and shut down / send an exception as needed. We already have a reference counting mechanism, why not use it? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] swaroopak commented on a change in pull request #860: PHOENIX-6080: Add a check to Index Rebuild jobs to check region closi…

Posted by GitBox <gi...@apache.org>.
swaroopak commented on a change in pull request #860:
URL: https://github.com/apache/phoenix/pull/860#discussion_r474239598



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -899,6 +906,7 @@ private void rebuildIndexRows(Map<byte[], List<Mutation>> indexKeyToMutationMap,
                 }
             }
             if (batchSize > 0) {
+                ungroupedAggregateRegionObserver.checkForRegionClosing();

Review comment:
       good catch! Thanks!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix] kadirozde commented on a change in pull request #860: PHOENIX-6080: Add a check to Index Rebuild jobs to check region closi…

Posted by GitBox <gi...@apache.org>.
kadirozde commented on a change in pull request #860:
URL: https://github.com/apache/phoenix/pull/860#discussion_r474238774



##########
File path: phoenix-core/src/main/java/org/apache/phoenix/coprocessor/IndexRebuildRegionScanner.java
##########
@@ -899,6 +906,7 @@ private void rebuildIndexRows(Map<byte[], List<Mutation>> indexKeyToMutationMap,
                 }
             }
             if (batchSize > 0) {
+                ungroupedAggregateRegionObserver.checkForRegionClosing();

Review comment:
       You missed the for loop before this (line 903)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org