You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@solr.apache.org by "stillalex (via GitHub)" <gi...@apache.org> on 2023/03/31 18:47:33 UTC

[GitHub] [solr] stillalex commented on pull request #1504: SOLR-7609 ShardSplitTest NPE investigation

stillalex commented on PR #1504:
URL: https://github.com/apache/solr/pull/1504#issuecomment-1492441639

   I have pushed a tentative patch for review and more importantly for discussion.
   
   I have ran this test extensively (hundreds of beast jobs) and it looks like this is the way to go, all 'add' exceptions are gone once the isLeader flag is updated. BUT I am not completely convinced this is the correct patch. possibly something could be made better in the DistributedZkUpdateProcessor area instead (where a lot of leader calculations are being made but the result is not correct in this edge case).
   While I am still evaluating a patch for DistributedZkUpdateProcessor, I would love some feedback on the observations so far.
   
   Together with the patch I have added a check which will not allow additions with no version. this is now consistent with the delete flows. although I must admit I am confused by the fact that this check was missing on this execution path, while the delete path had this from at least 8 years ago.
   
   Other points
    - in the current test, there are a few operations (add and deletes) that run parallel to the split operation. these can fail and this is where the bug is. because the exceptions are not correctly accounted for, the NPE popped up at a weird place. I added tracking and assertions on this part too (except for deletes, see below)
    - deletes can also cause exceptions, I don't have a good handle on how to fix this, so I removed the strict error check. this can be added once the confidence level is higher and this flow is accounted for. until then I will focus on the addition parts.
    - DistributedZkUpdateProcessor will call setupRequest twice. this is probably a minor nuisance for some future improvement PR.
    - finding the code very hard to read I took the decision to remove one variable that is not really needed (the `updateCommand`), this change is purely cosmetic and I can revert it if it adds too much noise to the current patch.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org