Posted to user@cassandra.apache.org by Paul Chandler <pa...@redshots.com> on 2021/11/15 10:43:00 UTC

Hint file getting stuck

Hi all

We keep having a problem with hint files on one of our Cassandra nodes (v3.11.6): the following messages keep being logged for the same file.

INFO [HintsDispatcher:25] 2021-11-02 08:55:29,830 HintsDispatchExecutor.java:289 - Finished hinted handoff of file 72a18469-b7d2-499a-aed3-fd4e2cda9678-1635838529279-1.hints to endpoint /10.29.49.210: 72a18469-b7d2-499a-aed3-fd4e2cda9678, partially
INFO [HintsDispatcher:24] 2021-11-02 08:55:39,812 HintsDispatchExecutor.java:289 - Finished hinted handoff of file 72a18469-b7d2-499a-aed3-fd4e2cda9678-1635838529279-1.hints to endpoint /10.29.49.210: 72a18469-b7d2-499a-aed3-fd4e2cda9678, partially
INFO [HintsDispatcher:25] 2021-11-02 08:55:49,822 HintsDispatchExecutor.java:289 - Finished hinted handoff of file 72a18469-b7d2-499a-aed3-fd4e2cda9678-1635838529279-1.hints to endpoint /10.29.49.210: 72a18469-b7d2-499a-aed3-fd4e2cda9678, partially
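To confirm that one file really is being retried over and over, the log can be scanned for these lines. The following is a minimal sketch, not part of the original post: the function names are hypothetical, the pattern is written against the three lines quoted above, and the decoding assumes the hint file name has the layout host ID, creation time in milliseconds, format version.

```python
import re
from collections import Counter
from datetime import datetime, timedelta, timezone

# Pattern written against the "Finished hinted handoff" lines quoted above.
# The trailing ", partially" is optional so that complete dispatches also match.
LINE_RE = re.compile(
    r"Finished hinted handoff of file (?P<file>\S+\.hints) "
    r"to endpoint (?P<endpoint>\S+): (?P<host_id>[0-9a-f-]{36})"
    r"(?P<partial>, partially)?"
)


def count_partial_dispatches(lines):
    """Count partial dispatches per (hint file, endpoint) pair."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("partial"):
            counts[(match.group("file"), match.group("endpoint"))] += 1
    return counts


def decode_hint_file_name(file_name):
    """Split a hint file name into host ID, creation time (UTC) and version.

    Assumes the layout <host id>-<creation time in ms>-<version>.hints.
    """
    stem = file_name[: -len(".hints")]
    host_id, millis, version = stem.rsplit("-", 2)
    millis = int(millis)
    created = datetime.fromtimestamp(millis // 1000, tz=timezone.utc)
    created += timedelta(milliseconds=millis % 1000)
    return host_id, created, int(version)
```

A file whose count keeps growing while its creation time stays the same is a candidate for the "stuck" behaviour described here.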

On the receiving node (cassandra0) we see the CPU shoot up; this is how we notice we have a problem.

This has happened several times with different files, and we find the only way to stop this is to delete the offending hint files.
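Before deleting anything, it can help to see which hint files exist and how old they are. The sketch below is an illustration only: `summarize_hint_files` is a hypothetical helper, the directory passed in is whatever `hints_directory` points to on the node, and the age calculation assumes the file name carries the creation time in milliseconds as its second-to-last dash-separated field.

```python
import time
from pathlib import Path


def summarize_hint_files(hints_dir, now_ms=None):
    """Return (file name, target host ID, age in seconds, size in bytes), oldest first.

    Assumes file names of the form <host id>-<creation time in ms>-<version>.hints.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    rows = []
    for path in Path(hints_dir).glob("*.hints"):
        host_id, created_ms, _version = path.stem.rsplit("-", 2)
        age_s = (now_ms - int(created_ms)) / 1000
        rows.append((path.name, host_id, age_s, path.stat().st_size))
    # Oldest files first: these are the most likely to be the ones being retried.
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

The helper only reads file names and sizes; it does not touch the files themselves.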

The cluster can be a bit overloaded, which is what causes the hint files to be generated in the first place, and we are working to get that stopped. However, the question I don’t know the answer to is: what is causing this “partially” hint processing, and how can we stop it happening?

Thanks

Paul

Re: Hint file getting stuck

Posted by Sandeep Nethi <ne...@gmail.com>.
It could also be related to the mutation size of your writes in hint files.
Are you seeing any mutation warnings? If yes, temporarily increasing the
commitlog segment size would help to solve the problem.
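As background to this suggestion: in cassandra.yaml, `max_mutation_size_in_kb` defaults to half of `commitlog_segment_size_in_mb`, so raising the segment size also raises the largest mutation a node will accept. The arithmetic can be sketched as follows. The function names are hypothetical, and the calculation is an approximation that ignores any per-entry overhead the commit log adds.

```python
MIB = 1024 * 1024


def default_max_mutation_size_kb(commitlog_segment_size_mb=32):
    """Default mutation size limit in KB: half of the commit log segment size."""
    return commitlog_segment_size_mb * 1024 // 2


def segment_size_needed_mb(mutation_bytes):
    """Smallest whole-MB segment size whose default limit admits the mutation.

    Approximation: the limit is taken as exactly half the segment size,
    with no allowance for commit log entry overhead.
    """
    # Ceiling division of (2 * mutation size) by one MiB.
    return -(-2 * mutation_bytes // MIB)


if __name__ == "__main__":
    print(default_max_mutation_size_kb(32))    # 16384
    print(segment_size_needed_mb(20 * MIB))    # 40
```

With the default 32 MB segment size, for example, a 20 MiB mutation is over the limit and would need a segment size of at least 40 MB under these assumptions.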

Thanks,

On Tue, Nov 16, 2021 at 1:29 PM Bowen Song <bo...@bso.ng> wrote:

> I think your problem is likely not the "stuck" hints, but the write
> requests in them.
>
> The reason those write requests ended up in the hint file is that they
> have failed before. They are likely to fail again when they are retried if
> the failure was caused by the write requests themselves rather than by
> network issues or nodes temporarily overloaded by other queries.

Re: Hint file getting stuck

Posted by Bowen Song <bo...@bso.ng>.
I think your problem is likely not the "stuck" hints, but the write 
requests in them.

The reason those write requests ended up in the hint file is that
they have failed before. They are likely to fail again when they are
retried if the failure was caused by the write requests themselves
rather than by network issues or nodes temporarily overloaded by other
queries.

