Posted to dev@cassandra.apache.org by Bingqin Zhou <bi...@wepay.com.INVALID> on 2021/04/28 22:22:52 UTC
Cassandra doesn't flush any commit log files into cdc_raw directory
Hi all,
We're working on a Kafka connector to capture data changes in Cassandra by
processing commit log files in the cdc_raw directory. After we enabled CDC
on a few tables, we didn't observe any commit log files being flushed
into the cdc_raw directory as expected; instead, we got
WriteTimeoutException in Cassandra.
Here's how we reproduce the issue:
1. Our Cassandra Settings:
- Cassandra Version: 3.11.9
- Related configs in cassandra.yaml:
  - cdc_enabled: true
  - cdc_total_space_in_mb: 4096
  - commitlog_segment_size_in_mb: 32
  - commitlog_total_space_in_mb: 8192
  - commitlog_sync: periodic
  - commitlog_sync_period_in_ms: 10000
2. Enable CDC on a few tables by CQL:
ALTER TABLE foo WITH cdc=true;
3. After a few days, we get *WriteTimeoutException* in Cassandra.
At the same time, the cdc_raw directory is still empty, with no commit
log flushed/copied into it at all.
I want to understand why no commit log file is flushed into the
cdc_raw directory at all, even after the cdc_total_space_in_mb threshold has
been reached and write suspension has been triggered in Cassandra. This
looks like a bug, and it currently makes the CDC feature unusable for us.
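
To make the numbers in the settings above concrete, here is a minimal Python sketch (not part of the original report). It only does the arithmetic on cdc_total_space_in_mb and commitlog_segment_size_in_mb and measures how much data sits in a directory. The path /var/lib/cassandra/cdc_raw is an assumption; substitute whatever cdc_raw_directory is configured on the node.

```python
from pathlib import Path

# Values taken from the settings listed in this report.
CDC_TOTAL_SPACE_MB = 4096
COMMITLOG_SEGMENT_SIZE_MB = 32


def max_cdc_segments(total_mb: int, segment_mb: int) -> int:
    """Number of full-size commit log segments that fit under the CDC limit."""
    return total_mb // segment_mb


def directory_usage_mb(directory: Path) -> float:
    """Total size in MB of the regular files directly inside `directory`.

    Returns 0.0 if the directory does not exist.
    """
    if not directory.is_dir():
        return 0.0
    total_bytes = sum(p.stat().st_size for p in directory.iterdir() if p.is_file())
    return total_bytes / (1024 * 1024)


if __name__ == "__main__":
    # Hypothetical location; use the cdc_raw_directory configured on the node.
    cdc_raw = Path("/var/lib/cassandra/cdc_raw")
    print("segments that fit under the limit:",
          max_cdc_segments(CDC_TOTAL_SPACE_MB, COMMITLOG_SEGMENT_SIZE_MB))
    print("cdc_raw usage (MB):", directory_usage_mb(cdc_raw))
```

With the values above, 4096 / 32 gives 128 full-size segments, so an empty cdc_raw directory alongside rejected writes suggests the space is being accounted for somewhere other than cdc_raw itself.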
Thanks so much,
Bingqin Zhou
Re: Cassandra doesn't flush any commit log files into cdc_raw directory
Posted by Bingqin Zhou <bi...@wepay.com>.
Hi Ahmed,
Thank you for your insights! I don't think the write rates are low for the
tables I enabled CDC on; otherwise, the commit log size wouldn't exceed
cdc_total_space_in_mb (4096) so quickly. I'll try to dig more into what
affects the speed of memtable flushes.
Bingqin Zhou
On Thu, Apr 29, 2021 at 1:11 AM Ahmed Eljami <ah...@gmail.com> wrote:
> Hi Bingqin,
>
> When cdc_raw directory is full, Cassandra rejects new writes on this node
> with the following message in the log:
>
> - Rejecting Mutation containing CDC-enabled table.....
>
> https://github.com/apache/cassandra/blob/cassandra-3.11.9/src/java/org/apache/cassandra/db/commitlog/CommitLogSegmentManagerCDC.java#L125
>
> So I'm not sure whether the WriteTimeoutException is related to the CDC
> feature in your case.
>
> For your issue with no commit log being flushed/copied into the cdc_raw
> directory, can you try an explicit nodetool flush and see whether the
> commit log gets transferred? For tables with a very low write rate, a
> memtable can take a long time before it is flushed to disk.
>
> Cheers,
> Ahmed
>
>
> On Thu, Apr 29, 2021 at 00:23, Bingqin Zhou <bi...@wepay.com.invalid>
> wrote:
>
>> Hi all,
>>
>> We're working on a Kafka connector to capture data changes in Cassandra by
>> processing commit log files in the cdc_raw directory. After we enabled CDC
>> on a few tables, we didn't observe any commit log files getting flushed
>> into cdc_raw directory as expected, but got WriteTimeoutException in
>> Cassandra DB.
>>
>> Here's how we reproduce the issue:
>>
>> 1. Our Cassandra Settings:
>>
>> - Cassandra Version: 3.11.9
>> - Related configs in Cassandra.yaml:
>> - cdc_enabled: true
>> - cdc_total_space_in_mb: 4096
>> - commitlog_segment_size_in_mb: 32mb
>> - commitlog_total_space_in_mb: 8192
>> - commitlog_sync: periodic
>> - commitlog_sync_period_in_ms: 10000
>>
>> 2. Enable CDC on a few tables by CQL:
>> ALTER TABLE foo WITH cdc=true;
>>
>> 3. After a few days, we get *WriteTimeoutException* in Cassandra DB.
>> However at the same time, cdc_raw directory is still empty with no commit
>> log flushed/copied into it at all.
>>
>> I want to understand why there's no commit log file flushed into
>> cdc_raw directory at all even when the threshold cdc_total_space_in_mb has
>> been reached and write suspension has been triggered in Cassandra DB. This
>> sounds like a bug and currently makes the CDC feature useless.
>>
>> Thanks so much,
>> Bingqin Zhou
>>
>
>
> --
> Regards,
>
> Ahmed ELJAMI
>
Re: Cassandra doesn't flush any commit log files into cdc_raw directory
Posted by Ahmed Eljami <ah...@gmail.com>.
Hi Bingqin,
When cdc_raw directory is full, Cassandra rejects new writes on this node
with the following message in the log:
- Rejecting Mutation containing CDC-enabled table.....
https://github.com/apache/cassandra/blob/cassandra-3.11.9/src/java/org/apache/cassandra/db/commitlog/CommitLogSegmentManagerCDC.java#L125
So I'm not sure whether the WriteTimeoutException is related to the CDC
feature in your case.
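
One way to check whether the timeouts coincide with CDC rejections is to search the node's log for the message quoted above. The following is a minimal Python sketch under stated assumptions: it matches only the message prefix given in this reply, and the log path in the usage comment is hypothetical.

```python
import sys
from typing import Iterable, List

# Message prefix quoted in this reply; the full log line may carry more text.
REJECT_PREFIX = "Rejecting Mutation containing CDC-enabled table"


def find_cdc_rejections(lines: Iterable[str]) -> List[str]:
    """Return the log lines that contain the CDC rejection message."""
    return [line.rstrip("\n") for line in lines if REJECT_PREFIX in line]


if __name__ == "__main__":
    # Hypothetical usage: python find_rejections.py /var/log/cassandra/system.log
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
            hits = find_cdc_rejections(log)
        print(f"{len(hits)} CDC rejection line(s) found")
```

If no such lines show up around the time of the WriteTimeoutException, the timeouts probably have another cause.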
For your issue with no commit log being flushed/copied into the cdc_raw
directory, can you try an explicit nodetool flush and see whether the
commit log gets transferred? For tables with a very low write rate, a
memtable can take a long time before it is flushed to disk.
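
The check suggested above can be scripted roughly as follows. This is a sketch, not a tested procedure: it assumes nodetool is on the PATH of the node, and the keyspace name, table name, and cdc_raw path in the usage comment are placeholders. The runner is injectable so the flush step can be replaced when nodetool is not available.

```python
import subprocess
from pathlib import Path
from typing import Callable, List, Sequence


def flush_command(keyspace: str, tables: Sequence[str] = ()) -> List[str]:
    """Build the `nodetool flush <keyspace> [tables...]` command line."""
    return ["nodetool", "flush", keyspace, *tables]


def run_checked(cmd: List[str]) -> None:
    """Default runner: execute the command and raise if it fails."""
    subprocess.run(cmd, check=True)


def flush_and_list(keyspace: str,
                   tables: Sequence[str],
                   cdc_raw: Path,
                   run: Callable[[List[str]], None] = run_checked) -> List[str]:
    """Flush the given tables, then return the file names found in cdc_raw."""
    run(flush_command(keyspace, tables))
    if not cdc_raw.is_dir():
        return []
    return sorted(p.name for p in cdc_raw.iterdir() if p.is_file())


# Example call (placeholder names, adjust to the node being checked):
#   flush_and_list("my_keyspace", ["foo"], Path("/var/lib/cassandra/cdc_raw"))
```

If the returned list is still empty right after the flush, it is worth repeating the listing a little later before concluding that nothing is transferred.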
Cheers,
Ahmed
On Thu, Apr 29, 2021 at 00:23, Bingqin Zhou <bi...@wepay.com.invalid>
wrote:
> Hi all,
>
> We're working on a Kafka connector to capture data changes in Cassandra by
> processing commit log files in the cdc_raw directory. After we enabled CDC
> on a few tables, we didn't observe any commit log files getting flushed
> into cdc_raw directory as expected, but got WriteTimeoutException in
> Cassandra DB.
>
> Here's how we reproduce the issue:
>
> 1. Our Cassandra Settings:
>
> - Cassandra Version: 3.11.9
> - Related configs in Cassandra.yaml:
> - cdc_enabled: true
> - cdc_total_space_in_mb: 4096
> - commitlog_segment_size_in_mb: 32mb
> - commitlog_total_space_in_mb: 8192
> - commitlog_sync: periodic
> - commitlog_sync_period_in_ms: 10000
>
> 2. Enable CDC on a few tables by CQL:
> ALTER TABLE foo WITH cdc=true;
>
> 3. After a few days, we get *WriteTimeoutException* in Cassandra DB.
> However at the same time, cdc_raw directory is still empty with no commit
> log flushed/copied into it at all.
>
> I want to understand why there's no commit log file flushed into
> cdc_raw directory at all even when the threshold cdc_total_space_in_mb has
> been reached and write suspension has been triggered in Cassandra DB. This
> sounds like a bug and currently makes the CDC feature useless.
>
> Thanks so much,
> Bingqin Zhou
>
--
Regards,
Ahmed ELJAMI