Posted to user@cassandra.apache.org by Bingqin Zhou <bi...@wepay.com> on 2021/04/28 22:22:52 UTC

Cassandra doesn't flush any commit log files into cdc_raw directory

Hi all,

We're working on a Kafka connector that captures data changes in Cassandra
by processing commit log files in the cdc_raw directory. After we enabled
CDC on a few tables, we did not see any commit log files flushed into the
cdc_raw directory as expected. Instead, we got WriteTimeoutException in
Cassandra.

Here's how we reproduce the issue:

1. Our Cassandra Settings:

- Cassandra Version: 3.11.9
- Related configs in cassandra.yaml:
   - cdc_enabled: true
   - cdc_total_space_in_mb: 4096
   - commitlog_segment_size_in_mb: 32
   - commitlog_total_space_in_mb: 8192
   - commitlog_sync: periodic
   - commitlog_sync_period_in_ms: 10000
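
To put these numbers in perspective, here is a small arithmetic sketch of
how many commit log segments of the configured size fit under each limit.
It only uses the values listed above and assumes the segment size is 32 MB;
the function name is made up for illustration.

```python
# Values copied from the cassandra.yaml settings listed above.
CDC_TOTAL_SPACE_MB = 4096
COMMITLOG_SEGMENT_SIZE_MB = 32  # assumption: "32mb" above means 32 MB
COMMITLOG_TOTAL_SPACE_MB = 8192


def segments_that_fit(total_mb: int, segment_mb: int) -> int:
    """Return how many whole segments of segment_mb fit into total_mb."""
    return total_mb // segment_mb


# Number of full segments under the CDC limit and the commit log limit.
print(segments_that_fit(CDC_TOTAL_SPACE_MB, COMMITLOG_SEGMENT_SIZE_MB))
print(segments_that_fit(COMMITLOG_TOTAL_SPACE_MB, COMMITLOG_SEGMENT_SIZE_MB))
```

With these settings the CDC limit corresponds to 128 full segments, and the
commit log limit to 256.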

2. Enable CDC on a few tables via CQL:
  ALTER TABLE foo WITH cdc=true;

3. After a few days, we get *WriteTimeoutException* in Cassandra. At the
same time, the cdc_raw directory is still empty: no commit log has been
flushed or copied into it at all.

I want to understand why no commit log file is flushed into the cdc_raw
directory at all, even though the cdc_total_space_in_mb threshold has been
reached and Cassandra has started rejecting writes. This looks like a bug,
and it currently makes the CDC feature unusable for us.
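
For readers debugging the same symptom, a quick way to see where the space
is going is to compare the size of the commit log directory with the size
of cdc_raw. The sketch below does only that. The CDC documentation appears
to describe cdc_total_space_in_mb as covering active commit log segments
that hold CDC data plus the segments already in cdc_raw, which would
explain hitting the limit with an empty cdc_raw, but treat that reading as
an assumption. The function names are hypothetical, and the commit log
directory size is only a rough upper bound because it also includes
segments without CDC data.

```python
import os


def dir_size_mb(path: str) -> float:
    """Return the total size of regular files under path, in MiB."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total / (1024 * 1024)


def cdc_usage_report(commitlog_dir: str, cdc_raw_dir: str, limit_mb: int) -> dict:
    """Compare on-disk sizes of the two directories against a limit in MiB."""
    commitlog_mb = dir_size_mb(commitlog_dir)
    cdc_raw_mb = dir_size_mb(cdc_raw_dir)
    return {
        "commitlog_mb": commitlog_mb,
        "cdc_raw_mb": cdc_raw_mb,
        "limit_mb": limit_mb,
        # True when cdc_raw holds at least some bytes.
        "cdc_raw_has_data": cdc_raw_mb > 0,
        # Rough signal only: counts every segment, not just CDC ones.
        "commitlog_at_or_over_limit": commitlog_mb >= limit_mb,
    }
```

It could be called as cdc_usage_report(commitlog_dir, cdc_raw_dir, 4096),
where both directory arguments are whatever commitlog_directory and
cdc_raw_directory are set to in cassandra.yaml on the node.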

Thanks so much,
Bingqin Zhou

Re: Cassandra doesn't flush any commit log files into cdc_raw directory

Posted by Bingqin Zhou <bi...@wepay.com>.
Hi Ahmed,

Thank you for your insights! I don't think the write rates are low for the
tables I enabled CDC on; otherwise, the commit log size wouldn't exceed
cdc_total_space_in_mb (4096) so quickly. I'll dig further into what affects
the speed of memtable flushes.

Bingqin Zhou


Re: Cassandra doesn't flush any commit log files into cdc_raw directory

Posted by Ahmed Eljami <ah...@gmail.com>.
Hi Bingqin,

When the cdc_raw directory is full, Cassandra rejects new writes on this
node with the following message in the log:

- Rejecting Mutation containing CDC-enabled table.....
https://github.com/apache/cassandra/blob/cassandra-3.11.9/src/java/org/apache/cassandra/db/commitlog/CommitLogSegmentManagerCDC.java#L125

So, I'm not sure whether the WriteTimeoutException is related to the CDC
feature in your case.

As for no commit log being flushed/copied into the cdc_raw directory, can
you try an explicit nodetool flush and see whether the commit log gets
transferred? For tables with a very low write rate, the memtable can take a
long time before being flushed to disk.
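
The check suggested above can be sketched as a small script: build the
nodetool flush command for the CDC-enabled table, run it, then list what
shows up in cdc_raw. The helper names are hypothetical, and the
"CommitLog-*.log" file pattern and the keyspace/table names used in the
usage note are assumptions, so adjust them to the node being tested.

```python
import glob
import os
import subprocess


def build_flush_command(keyspace: str, tables: list) -> list:
    """Build the argv for 'nodetool flush <keyspace> <tables...>'."""
    return ["nodetool", "flush", keyspace, *tables]


def run_flush(keyspace: str, tables: list) -> None:
    """Run nodetool flush and raise if it exits with a non-zero status."""
    subprocess.run(build_flush_command(keyspace, tables), check=True)


def list_cdc_segments(cdc_raw_dir: str) -> list:
    """Return the sorted commit log segment file names found in cdc_raw_dir.

    Assumes segment files are named like CommitLog-<version>-<id>.log.
    """
    paths = glob.glob(os.path.join(cdc_raw_dir, "CommitLog-*.log"))
    return sorted(os.path.basename(p) for p in paths)
```

A typical run would call run_flush("my_keyspace", ["foo"]) on the node and
then compare list_cdc_segments(...) for the cdc_raw directory before and
after the flush. If the list is still empty afterwards, the low write rate
explanation is less likely.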

Cheers,
Ahmed


-- 
Regards,

Ahmed ELJAMI
