Posted to user@cassandra.apache.org by vaibhav khedkar <vk...@gmail.com> on 2023/01/14 05:24:18 UTC

Compactions are stuck in 4.0.5 version

Hello All,

We are facing an issue where a few of the nodes are not able to complete
compactions.
We tried restarting, scrubbing, and even rebuilding an entire node, but
nothing has worked so far.

It's a 10-region installation with close to 150 nodes.

DataStax support
<https://support.datastax.com/s/article/ERROR-Failure-serializing-partition-key>
suggested rebuilding the node, but that did not help. Any help is
appreciated.

Following is the stack trace:

ERROR [CompactionExecutor:50] 2023-01-14 05:12:20,795 CassandraDaemon.java:581 - Exception in thread Thread[CompactionExecutor:50,1,main]
org.apache.cassandra.db.rows.PartitionSerializationException: Failed to serialize partition key '<key>' on table '<table>' in keyspace '<keyspace>'.
    at org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:240)
    at org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:125)
    at org.apache.cassandra.db.compaction.writers.MaxSSTableSizeWriter.realAppend(MaxSSTableSizeWriter.java:84)
    at org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:137)
    at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:193)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:77)
    at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:100)
    at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:298)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.BufferOverflowException: null
    at org.apache.cassandra.io.util.DataOutputBuffer.validateReallocation(DataOutputBuffer.java:136)
    at org.apache.cassandra.io.util.DataOutputBuffer.calculateNewSize(DataOutputBuffer.java:154)
    at org.apache.cassandra.io.util.DataOutputBuffer.expandToFit(DataOutputBuffer.java:161)
    at org.apache.cassandra.io.util.DataOutputBuffer.doFlush(DataOutputBuffer.java:121)
    at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:121)
    at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:93)
    at org.apache.cassandra.db.marshal.ByteArrayAccessor.write(ByteArrayAccessor.java:61)
    at org.apache.cassandra.db.marshal.ByteArrayAccessor.write(ByteArrayAccessor.java:38)
    at org.apache.cassandra.db.marshal.ValueAccessor.writeWithVIntLength(ValueAccessor.java:164)
    at org.apache.cassandra.db.marshal.AbstractType.writeValue(AbstractType.java:451)
    at org.apache.cassandra.db.ClusteringPrefix$Serializer.serializeValuesWithoutSize(ClusteringPrefix.java:397)
    at org.apache.cassandra.db.Clustering$Serializer.serialize(Clustering.java:132)
    at org.apache.cassandra.db.ClusteringPrefix$Serializer.serialize(ClusteringPrefix.java:339)
    at org.apache.cassandra.io.sstable.IndexInfo$Serializer.serialize(IndexInfo.java:110)
    at org.apache.cassandra.io.sstable.IndexInfo$Serializer.serialize(IndexInfo.java:91)
    at org.apache.cassandra.db.ColumnIndex.addIndexBlock(ColumnIndex.java:223)
    at org.apache.cassandra.db.ColumnIndex.add(ColumnIndex.java:271)
    at org.apache.cassandra.db.ColumnIndex.buildRowIndex(ColumnIndex.java:118)
    at org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:216)
    ... 14 common frames omitted

Thanks
vaibhav

Re: Compactions are stuck in 4.0.5 version

Posted by vaibhav khedkar <vk...@gmail.com>.
Thank you so much, Scott.

Increasing the value from 64 to 128 fixed the issue for us.

We will certainly look at our data model to understand why the partitions
are growing so large.
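
If I'm following Scott's explanation correctly, the math is roughly:

  index entries per partition ≈ partition size / column_index_size_in_kb
  serialized index size ≈ number of entries × size of one IndexInfo entry

so doubling the block size from 64 to 128 should have roughly halved the
serialized index, bringing it back under the 2GiB buffer limit.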

Thanks
vaibhav

On Fri, Jan 13, 2023 at 9:45 PM C. Scott Andreas <sc...@paradoxica.net>
wrote:

> Hi Vaibhav, thanks for reaching out.
>
> Based on my understanding of this exception, this may be due to the index
> for this partition exceeding 2GiB (which is *extremely* large for a
> partition index component). The index is serialized into an in-memory
> buffer that cannot grow past 2GiB, which matches the
> BufferOverflowException in the trace.
>
> Reducing the serialized size of the column index below 2GiB may resolve
> this issue. You may be able to do so by increasing
> "column_index_size_in_kb" in cassandra.yaml, which defines the
> granularity of this index: raising the default of "64" to "128" roughly
> halves the number of index entries, and may let the compaction complete.
>
> Docs for this param are in cassandra.yaml:
> # Granularity of the collation index of rows within a partition.
> # Increase if your rows are large, or if you have a very large
> # number of rows per partition.  The competing goals are these:
> #
> # - a smaller granularity means more index entries are generated
> #   and looking up rows within the partition by collation column
> #   is faster
> # - but, Cassandra will keep the collation index in memory for hot
> #   rows (as part of the key cache), so a larger granularity means
> #   you can cache more hot rows
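>
> As a minimal sketch of the change (assuming the 4.0 default of 64 is
> still in place; cassandra.yaml is only read at startup, so each node
> needs a restart to pick up the new value):
>
> # cassandra.yaml
> column_index_size_in_kb: 128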
>
> However, the root of this issue is likely a partition whose size has
> become extraordinary. I would recommend determining what caused an
> individual partition to grow so large that its index exceeds 2GiB, and
> whether that partition can be removed or the table's data model adjusted
> so that such a large number of rows is no longer stored within one
> partition.
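>
> To find the offending partitions, nodetool reports per-table
> partition-size percentiles (the keyspace and table names here are
> placeholders):
>
> nodetool tablehistograms <keyspace> <table>
>
> Cassandra 4.0 also logs a "Writing large partition" warning during
> compaction once a partition crosses
> compaction_large_partition_warning_threshold_mb (100MB by default), so
> grepping system.log for that message should surface the culprits too.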
>
> – Scott
>
> On Jan 13, 2023, at 9:24 PM, vaibhav khedkar <vk...@gmail.com> wrote:
>
> [original message and stack trace snipped; quoted in full at the top of
> this thread]
>

-- 
Best Regards,
Vaibhav Khedkar
Google Inc.
Mobile: (806) 252-2912

Email: vkhedkar7@gmail.com
