Posted to commits@cassandra.apache.org by "Aleksey Yeschenko (Jira)" <ji...@apache.org> on 2021/10/20 12:25:00 UTC

[jira] [Commented] (CASSANDRA-17049) Fix rare NPE caused by batchlog replay / node decommission races

    [ https://issues.apache.org/jira/browse/CASSANDRA-17049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17431181#comment-17431181 ] 

Aleksey Yeschenko commented on CASSANDRA-17049:
-----------------------------------------------

3.0: [code|https://github.com/iamaleksey/cassandra/commits/17049-3.0], [ci|https://app.circleci.com/pipelines/github/iamaleksey/cassandra?branch=17049-3.0]
3.11: [code|https://github.com/iamaleksey/cassandra/commits/17049-3.11], [ci|https://app.circleci.com/pipelines/github/iamaleksey/cassandra?branch=17049-3.11]
4.0: [code|https://github.com/iamaleksey/cassandra/commits/17049-4.0], [ci|https://app.circleci.com/pipelines/github/iamaleksey/cassandra?branch=17049-4.0]
trunk: [code|https://github.com/iamaleksey/cassandra/commits/17049-trunk], [ci|https://app.circleci.com/pipelines/github/iamaleksey/cassandra?branch=17049-trunk]

Changes are covered by existing batchlog tests.

> Fix rare NPE caused by batchlog replay / node decommission races
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-17049
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17049
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Batch Log, Consistency/Hints
>            Reporter: Aleksey Yeschenko
>            Assignee: Aleksey Yeschenko
>            Priority: Low
>             Fix For: 3.0.x, 3.11.x, 4.0.x, 4.x
>
>
> The batchlog replay process collects the addresses of the hosts it has hinted to, so that it can flush their hints to disk before confirming deletion of the replayed batches. If a node is decommissioned during replay, however, then by the time the hints are flushed at the very end of replay, {{StorageService.getHostIdForEndpoint()}} returns {{null}} for that node's address. Down the line, this causes {{HintsCatalog::get()}} to be invoked with a {{null}} host id argument, resulting in an NPE.
> The simple fix is to null-check the host ids returned for addresses, and to collect hinted host ids instead of hinted addresses.
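
The shape of the fix described above can be illustrated with a minimal, self-contained sketch. This is not the actual patch: {{HintedHostCollector}}, {{recordHint()}} and {{hostIdsToFlush()}} are hypothetical names, and a plain {{Map}} stands in for the endpoint-to-host-id lookup that {{StorageService.getHostIdForEndpoint()}} performs. The assumption is that a decommissioned node has no entry, so the lookup returns {{null}}; skipping the hint for such a node is likewise an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Hypothetical stand-in for the replay bookkeeping; not the actual Cassandra code.
final class HintedHostCollector
{
    // Stand-in for the endpoint -> host id mapping. A decommissioned node
    // has no entry here, so a lookup for its address returns null.
    private final Map<String, UUID> hostIdsByEndpoint;

    // Host ids (rather than addresses) that were hinted to during replay.
    private final Set<UUID> hintedHostIds = new HashSet<>();

    HintedHostCollector(Map<String, UUID> hostIdsByEndpoint)
    {
        this.hostIdsByEndpoint = hostIdsByEndpoint;
    }

    // Resolve the host id when the hint is recorded and skip endpoints that
    // no longer have one. Returns true if a host id was recorded.
    boolean recordHint(String endpoint)
    {
        UUID hostId = hostIdsByEndpoint.get(endpoint);
        if (hostId == null)
            return false; // assumed: the node has left the ring, nothing to flush later
        hintedHostIds.add(hostId);
        return true;
    }

    // At the end of replay, every collected id is non-null and safe to pass on.
    Set<UUID> hostIdsToFlush()
    {
        return new HashSet<>(hintedHostIds);
    }

    public static void main(String[] args)
    {
        Map<String, UUID> ring = new HashMap<>();
        ring.put("10.0.0.1", UUID.randomUUID());

        HintedHostCollector collector = new HintedHostCollector(ring);
        collector.recordHint("10.0.0.1"); // live node: host id is collected
        collector.recordHint("10.0.0.2"); // unknown/decommissioned node: skipped

        System.out.println("host ids to flush: " + collector.hostIdsToFlush().size());
    }
}
```

Because the lookup happens when the hint is recorded, the flush step at the end of replay never has to translate an address that may have disappeared from the ring in the meantime.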


