Posted to user@cassandra.apache.org by Joe Obernberger <jo...@gmail.com> on 2020/12/02 15:55:23 UTC
Digest mismatch
Hi All - this is my first post here. I've been using Cassandra for
several months now and am loving it. We are moving from Apache HBase to
Cassandra for a big data analytics platform.
I'm using Java to get rows from Cassandra and very frequently get a
java.util.NoSuchElementException when iterating through a ResultSet. If
I retry the query (often several times), it works. The debug log
on the Cassandra nodes shows this message:
org.apache.cassandra.service.DigestMismatchException: Mismatch for key
DecoratedKey
My cluster looks like this:
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 172.16.100.224 340.5 GiB 512 50.9% 8ba646ac-2b33-49de-a220-ae9842f18806 rack1
UN 172.16.100.208 269.19 GiB 384 40.3% 4e0ba42f-649b-425a-857a-34497eb3036e rack1
UN 172.16.100.225 282.83 GiB 512 50.4% 247f3d70-d13b-4d68-9a53-2ed58e01a63e rack1
UN 172.16.110.3 409.78 GiB 768 63.2% 0abea102-06d2-4309-af36-a3163e8f00d8 rack1
UN 172.16.110.4 330.15 GiB 512 50.6% 2a5ae735-6304-4e99-924b-44d9d5ec86b7 rack1
UN 172.16.100.253 98.88 GiB 128 14.6% 6b528b0b-d7f7-4378-bba8-1857802d4f18 rack1
UN 172.16.100.254 204.5 GiB 256 30.0% 87d0cb48-a57d-460e-bd82-93e6e52e93ea rack1
I suspect this has to do with how I'm using consistency levels.
Typically I'm using ONE. I just set dclocal_read_repair_chance to
0.0, but I'm still seeing the issue. Any help/tips?
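For background, on a digest read the coordinator fetches full data from one replica and compact MD5 digests from the others; when the digests disagree, this exception is logged and a read repair reconciles the replicas. A rough conceptual sketch in Python (not Cassandra's actual code; the rows and hashing scheme here are invented for illustration):

```python
import hashlib

def digest(rows):
    """Stand-in for a replica's MD5 digest over its version of a partition."""
    return hashlib.md5(repr(sorted(rows)).encode()).hexdigest()

# Two replicas of the same partition; replica_b missed a recent write.
replica_a = [("H4722-88502", "value-v2")]
replica_b = [("H4722-88502", "value-v1")]

if digest(replica_a) != digest(replica_b):
    # This is the condition behind "DigestMismatchException: Mismatch for key ..."
    print("digest mismatch -> coordinator reconciles replicas (read repair)")
```

At consistency ONE the query itself only needs one replica, so these mismatches typically surface via background read repair rather than failing the read outright.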
Thank you!
-Joe Obernberger
---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
For additional commands, e-mail: user-help@cassandra.apache.org
Re: Digest mismatch
Posted by Joe Obernberger <jo...@gmail.com>.
Python eh? What's that? Kidding. (Java guy over here...)
I grepped the logs for mutations but only see messages like:
2020-09-14 16:15:19,963 CommitLog.java:149 - Log replay complete, 0
replayed mutations
and
2020-09-17 16:22:13,020 CommitLog.java:149 - Log replay complete, 291708
replayed mutations
Typically, we read very soon after the write, which I thought was a
problem also; however at this point it's been 24+ hours since the data
has been written that I'm now trying to read. Happens very easily.
By determining the partition key, how will that help?
-Joe
On 12/2/2020 12:16 PM, Steve Lacerda wrote:
> The digest mismatch typically shows the partition key info, with
> something like this:
>
> DecoratedKey(-1671292413668442751, 48343732322d3838353032)
>
> That refers to the partition key, which you can gather like so:
>
> python
> import binascii
> binascii.unhexlify('48343732322d3838353032')
> 'H4722-88502'
>
> My assumption is that since you are reading and writing with one, that
> some nodes have the data and others don't. Are you seeing any dropped
> mutations in the logs? How long after the write are you attempting to
> read the same data?
>
>
>
>
>
>
> On Wed, Dec 2, 2020 at 9:12 AM Joe Obernberger <joseph.obernberger@gmail.com> wrote:
>
> Hi Carl - thank you for replying.
> I am using Cassandra 3.11.9-1
>
> Rows are not typically being deleted - I assume you're referring
> to Tombstones. I don't think that should be the case here as I
> don't think we've deleted anything here.
> This is a test cluster and some of the machines are small (hence
> the one node with 128 tokens and 14.6% - it has a lot less disk
> space than the other nodes). This is one of the features that I
> really like with Cassandra - being able to size nodes based on
> disk/CPU/RAM.
>
> All data is currently written with ONE. All data is read with
> ONE. I can replicate this issue at will, so can try different
> things easily. I tried changing the read process to use QUORUM
> and the issue still takes place. Right now I'm running a 'nodetool
> repair' to see if that helps. Our largest table 'doc' has the
> following stats:
>
> Table: doc
> SSTable count: 28
> Space used (live): 113609995010
> Space used (total): 113609995010
> Space used by snapshots (total): 0
> Off heap memory used (total): 225006197
> SSTable Compression Ratio: 0.37730474570644196
> Number of partitions (estimate): 93641747
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 3712
> Local read count: 891065091
> Local read latency: NaN ms
> Local write count: 7448281135
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 0.0
> Bloom filter false positives: 988
> Bloom filter false ratio: 0.00001
> Bloom filter space used: 151149880
> Bloom filter off heap memory used: 151149656
> Index summary off heap memory used: 38654701
> Compression metadata off heap memory used: 35201840
> Compacted partition minimum bytes: 104
> Compacted partition maximum bytes: 3379391
> Compacted partition mean bytes: 3389
> Average live cells per slice (last five minutes): NaN
> Maximum live cells per slice (last five minutes): 0
> Average tombstones per slice (last five minutes): NaN
> Maximum tombstones per slice (last five minutes): 0
> Dropped Mutations: 8174438
>
> Thoughts/ideas? Thank you!
>
> -Joe
>
> On 12/2/2020 11:49 AM, Carl Mueller wrote:
>> Why is one of your nodes only at 14.6% ownership? That's weird,
>> unless you have a small rowcount.
>>
>> Are you frequently deleting rows? Are you frequently writing rows
>> at ONE?
>>
>> What version of cassandra?
>>
>>
>>
>
>
>
> --
> Steve Lacerda
> e. steve.lacerda@datastax.com
> w. www.datastax.com
Re: Digest mismatch
Posted by Steve Lacerda <st...@datastax.com>.
The digest mismatch typically shows the partition key info, with something
like this:
DecoratedKey(-1671292413668442751, 48343732322d3838353032)
That refers to the partition key, which you can gather like so:
python
import binascii
binascii.unhexlify('48343732322d3838353032')
'H4722-88502'
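(Note for Python 3, where unhexlify returns bytes rather than str; a minimal equivalent:)

```python
import binascii

# Hex payload from the DecoratedKey(...) log line
raw = binascii.unhexlify('48343732322d3838353032')  # -> bytes in Python 3
print(raw.decode('ascii'))  # H4722-88502
```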
My assumption is that since you are reading and writing with one, that some
nodes have the data and others don't. Are you seeing any dropped mutations
in the logs? How long after the write are you attempting to read the same
data?
On Wed, Dec 2, 2020 at 9:12 AM Joe Obernberger <jo...@gmail.com> wrote:
> Hi Carl - thank you for replying.
> I am using Cassandra 3.11.9-1
>
> Rows are not typically being deleted - I assume you're referring to
> Tombstones. I don't think that should be the case here as I don't think
> we've deleted anything here.
> This is a test cluster and some of the machines are small (hence the one
> node with 128 tokens and 14.6% - it has a lot less disk space than the
> other nodes). This is one of the features that I really like with
> Cassandra - being able to size nodes based on disk/CPU/RAM.
>
> All data is currently written with ONE. All data is read with ONE. I can
> replicate this issue at will, so can try different things easily. I tried
> changing the read process to use QUORUM and the issue still takes place.
> Right now I'm running a 'nodetool repair' to see if that helps. Our
> largest table 'doc' has the following stats:
>
> Table: doc
> SSTable count: 28
> Space used (live): 113609995010
> Space used (total): 113609995010
> Space used by snapshots (total): 0
> Off heap memory used (total): 225006197
> SSTable Compression Ratio: 0.37730474570644196
> Number of partitions (estimate): 93641747
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 3712
> Local read count: 891065091
> Local read latency: NaN ms
> Local write count: 7448281135
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 0.0
> Bloom filter false positives: 988
> Bloom filter false ratio: 0.00001
> Bloom filter space used: 151149880
> Bloom filter off heap memory used: 151149656
> Index summary off heap memory used: 38654701
> Compression metadata off heap memory used: 35201840
> Compacted partition minimum bytes: 104
> Compacted partition maximum bytes: 3379391
> Compacted partition mean bytes: 3389
> Average live cells per slice (last five minutes): NaN
> Maximum live cells per slice (last five minutes): 0
> Average tombstones per slice (last five minutes): NaN
> Maximum tombstones per slice (last five minutes): 0
> Dropped Mutations: 8174438
>
> Thoughts/ideas? Thank you!
>
> -Joe
> On 12/2/2020 11:49 AM, Carl Mueller wrote:
>
> Why is one of your nodes only at 14.6% ownership? That's weird, unless you
> have a small rowcount.
>
> Are you frequently deleting rows? Are you frequently writing rows at ONE?
>
> What version of cassandra?
>
>
>
>
>
--
Steve Lacerda
e. steve.lacerda@datastax.com
w. www.datastax.com
Re: Digest mismatch
Posted by Joe Obernberger <jo...@gmail.com>.
Thank you. OK - I can see from 'nodetool getendpoints keyspace table
key' that 3 nodes respond as one would expect. My theory is that once I
encounter the error, a read repair is triggered, and by the time I
execute nodetool, 3 nodes respond.
I tried a test with the same table, but with LOCAL_QUORUM on reads and
writes of new data, and it works. Thank you all for that! If I don't
care which version of the data is returned, then I should be able to use
ONE on reads, if LOCAL_QUORUM was used on writes - yes?
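For reference, reads are only guaranteed to see the latest write when the write and read replica sets overlap, i.e. W + R > RF; LOCAL_QUORUM writes with ONE reads do not satisfy this, so a read may return an older (but still some) version. The arithmetic, assuming RF=3:

```python
def replicas_overlap(write_replicas, read_replicas, rf):
    """True when every read set must intersect every write set (sees latest write)."""
    return write_replicas + read_replicas > rf

rf = 3
quorum = rf // 2 + 1                         # LOCAL_QUORUM with RF=3 -> 2 replicas
print(replicas_overlap(quorum, quorum, rf))  # True: QUORUM writes + QUORUM reads
print(replicas_overlap(quorum, 1, rf))       # False: QUORUM writes + ONE reads
```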
-Joe
Re: Digest mismatch
Posted by Joe Obernberger <jo...@gmail.com>.
Some more info.
From Java, using the DataStax 4.9.0 driver, I'm selecting an entire
table. After about 17 million rows (the table is probably around 150
million rows), I get:
com.datastax.oss.driver.api.core.servererrors.ReadFailureException:
Cassandra failure during read query at consistency ONE (1 responses were
required but only 0 replica responded, 1 failed)
It's almost as if the data was not written with LOCAL_QUORUM, but I've
triple checked.
If I stop writes to the table and reduce the load on Cassandra, then it
(the Java program) works OK. Presto queries still fail, but that might
be a Presto issue. Interestingly, they sometimes come back with the
'Cassandra failure during read query' error very quickly, but sometimes
go through 140 million rows and then die.
Are regular table repairs required to be run when using LOCAL_QUORUM? I
see no nodes down, or disk failures.
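Until the root cause is fixed, transient read failures like this are often handled with a retry-and-backoff wrapper on the client side; a generic sketch (the helper below is illustrative, not part of the DataStax driver):

```python
import time

def read_with_retry(run_query, attempts=5, base_delay_s=0.5):
    """Retry a read that fails transiently, backing off exponentially."""
    for attempt in range(attempts):
        try:
            return run_query()
        except Exception:  # in real code, catch the driver's ReadFailureException
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay_s * (2 ** attempt))
```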
-Joe
Re: Digest mismatch
Posted by Joe Obernberger <jo...@gmail.com>.
Thanks all for the help on this. I've changed all my writes to
LOCAL_QUORUM, and same with reads. Under a constant load of doing
writes to a table and reads from the same table, I'm still getting the:
DEBUG [ReadRepairStage:372] 2020-12-14 09:36:09,002 ReadCallback.java:244 - Digest mismatch:
org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey(-7287062361589376757, 44535f313034335f333332353839305f323032302d31322d31325430302d31392d33312e3330335a) (054250ecd7170b1707ec36c6f1798ed0 vs 5752eec36bff050dd363b7803c500a95)
    at org.apache.cassandra.service.DigestResolver.compareResponses(DigestResolver.java:92) ~[apache-cassandra-3.11.9.jar:3.11.9]
    at org.apache.cassandra.service.ReadCallback$AsyncRepairRunner.run(ReadCallback.java:235) ~[apache-cassandra-3.11.9.jar:3.11.9]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_272]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_272]
    at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) [apache-cassandra-3.11.9.jar:3.11.9]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_272]
Under load this happens a lot; several times a second on each of the
server nodes. I started with a new table and under light load, it
worked wonderfully - no issues. But under heavy load, it still occurs.
Is there a different setting?
Also, when this happens, I cannot query the table from presto as I then
get the familiar:
"Query 20201214_143949_00000_b3fnt failed: Cassandra timeout during read
query at consistency LOCAL_QUORUM (2 responses were required but only 1
replica responded)"
Changing Presto to use ONE results in a similar error: 1 was required,
but only 1 responded.
Any ideas? Things to try? Thanks!
-Joe
Re: Digest mismatch
Posted by Erick Ramirez <er...@datastax.com>.
>
> Thank you Steve - once I have the key, how do I get to a node?
>
Run this command to determine which replicas own the partition:
$ nodetool getendpoints <keyspace> <table> <partition_key>
> So if the propagation has not taken place and a node doesn't have the data
> and is the first to 'be asked' the client will get no data?
>
That's correct. It will not return data it doesn't have when querying with
a consistency of ONE. There are limited cases where ONE is applicable. In
most cases, a strong consistency of LOCAL_QUORUM is recommended to avoid
the scenario you described. Cheers!
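The replica counts behind these levels (and behind messages like "2 responses were required but only 1 replica responded") follow directly from the replication factor; a small sketch for the common levels, assuming a single datacenter:

```python
def required_replicas(consistency, rf):
    """Replicas that must respond for the request to succeed (common levels only)."""
    if consistency == "ONE":
        return 1
    if consistency in ("QUORUM", "LOCAL_QUORUM"):  # single-DC case
        return rf // 2 + 1
    if consistency == "ALL":
        return rf
    raise ValueError("unhandled consistency level: " + consistency)

print(required_replicas("LOCAL_QUORUM", 3))  # 2
print(required_replicas("ONE", 3))           # 1
```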
Re: Digest mismatch
Posted by Joe Obernberger <jo...@gmail.com>.
Thank you Steve - once I have the key, how do I get to a node?
After reading some of the documentation, it looks like the
load-balancing-policy below *is* a token aware policy. Perhaps writes
need to be done with QUORUM; I don't know how long Cassandra will take
to make sure replicas are consistent when doing ONE for all writes. So
if the propagation has not taken place and a node doesn't have the data
and is the first to 'be asked' the client will get no data?
-Joe
On 12/2/2020 2:09 PM, Steve Lacerda wrote:
> If you can determine the key, then you can determine which nodes do
> and do not have the data. You may be able to glean a bit more
> information that way: maybe one node is having problems rather than
> the entire cluster.
>
> On Wed, Dec 2, 2020 at 9:32 AM Joe Obernberger <joseph.obernberger@gmail.com> wrote:
>
> Clients are using an application.conf like:
>
> datastax-java-driver {
>     basic.request.timeout = 60 seconds
>     basic.request.consistency = ONE
>     basic.contact-points = ["172.16.110.3:9042", "172.16.110.4:9042",
>         "172.16.100.208:9042", "172.16.100.224:9042", "172.16.100.225:9042",
>         "172.16.100.253:9042", "172.16.100.254:9042"]
>     basic.load-balancing-policy {
>         local-datacenter = datacenter1
>     }
> }
>
> So no, I'm not using a token aware policy. I'm googling that
> now...cuz I don't know what it is!
>
> -Joe
>
> On 12/2/2020 12:18 PM, Carl Mueller wrote:
>> Are you using token aware policy for the driver?
>>
>> If your writes are one and your reads are one, the propagation
>> may not have happened depending on the coordinator that is used.
>>
>> TokenAware will make that a bit better.
>>
>> On Wed, Dec 2, 2020 at 11:12 AM Joe Obernberger <joseph.obernberger@gmail.com> wrote:
>>
>> Hi Carl - thank you for replying.
>> I am using Cassandra 3.11.9-1
>>
>> Rows are not typically being deleted - I assume you're
>> referring to Tombstones. I don't think that should be the
>> case here as I don't think we've deleted anything here.
>> This is a test cluster and some of the machines are small
>> (hence the one node with 128 tokens and 14.6% - it has a lot
>> less disk space than the other nodes). This is one of the
>> features that I really like with Cassandra - being able to
>> size nodes based on disk/CPU/RAM.
>>
>> All data is currently written with ONE. All data is read
>> with ONE. I can replicate this issue at will, so can try
>> different things easily. I tried changing the read process
>> to use QUORUM and the issue still takes place. Right now I'm
>> running a 'nodetool repair' to see if that helps. Our
>> largest table 'doc' has the following stats:
>>
>> Table: doc
>> SSTable count: 28
>> Space used (live): 113609995010
>> Space used (total): 113609995010
>> Space used by snapshots (total): 0
>> Off heap memory used (total): 225006197
>> SSTable Compression Ratio: 0.37730474570644196
>> Number of partitions (estimate): 93641747
>> Memtable cell count: 0
>> Memtable data size: 0
>> Memtable off heap memory used: 0
>> Memtable switch count: 3712
>> Local read count: 891065091
>> Local read latency: NaN ms
>> Local write count: 7448281135
>> Local write latency: NaN ms
>> Pending flushes: 0
>> Percent repaired: 0.0
>> Bloom filter false positives: 988
>> Bloom filter false ratio: 0.00001
>> Bloom filter space used: 151149880
>> Bloom filter off heap memory used: 151149656
>> Index summary off heap memory used: 38654701
>> Compression metadata off heap memory used: 35201840
>> Compacted partition minimum bytes: 104
>> Compacted partition maximum bytes: 3379391
>> Compacted partition mean bytes: 3389
>> Average live cells per slice (last five minutes): NaN
>> Maximum live cells per slice (last five minutes): 0
>> Average tombstones per slice (last five minutes): NaN
>> Maximum tombstones per slice (last five minutes): 0
>> Dropped Mutations: 8174438
>>
>> Thoughts/ideas? Thank you!
>>
>> -Joe
>>
>> On 12/2/2020 11:49 AM, Carl Mueller wrote:
>>> Why is one of your nodes only at 14.6% ownership? That's
>>> weird, unless you have a small rowcount.
>>>
>>> Are you frequently deleting rows? Are you frequently writing
>>> rows at ONE?
>>>
>>> What version of cassandra?
>>>
>>>
>>>
>>
>
>
> --
> Steve Lacerda
> e. steve.lacerda@datastax.com
> w. www.datastax.com
>
Re: Digest mismatch
Posted by Steve Lacerda <st...@datastax.com>.
If you can determine the key, then you can determine which nodes do and
do not have the data. You may be able to glean a bit more information
that way: maybe one node is having problems rather than the entire cluster.
On Wed, Dec 2, 2020 at 9:32 AM Joe Obernberger <jo...@gmail.com> wrote:
> Clients are using an application.conf like:
>
> datastax-java-driver {
>     basic.request.timeout = 60 seconds
>     basic.request.consistency = ONE
>     basic.contact-points = ["172.16.110.3:9042", "172.16.110.4:9042",
>         "172.16.100.208:9042", "172.16.100.224:9042", "172.16.100.225:9042",
>         "172.16.100.253:9042", "172.16.100.254:9042"]
>     basic.load-balancing-policy {
>         local-datacenter = datacenter1
>     }
> }
>
> So no, I'm not using a token aware policy. I'm googling that now...cuz I
> don't know what it is!
>
> -Joe
> On 12/2/2020 12:18 PM, Carl Mueller wrote:
>
> Are you using token aware policy for the driver?
>
> If your writes are one and your reads are one, the propagation may not
> have happened depending on the coordinator that is used.
>
> TokenAware will make that a bit better.
>
> On Wed, Dec 2, 2020 at 11:12 AM Joe Obernberger <
> joseph.obernberger@gmail.com> wrote:
>
>> Hi Carl - thank you for replying.
>> I am using Cassandra 3.11.9-1
>>
>> Rows are not typically being deleted - I assume you're referring to
>> Tombstones. I don't think that should be the case here as I don't think
>> we've deleted anything here.
>> This is a test cluster and some of the machines are small (hence the one
>> node with 128 tokens and 14.6% - it has a lot less disk space than the
>> other nodes). This is one of the features that I really like with
>> Cassandra - being able to size nodes based on disk/CPU/RAM.
>>
>> All data is currently written with ONE. All data is read with ONE. I
>> can replicate this issue at will, so can try different things easily. I
>> tried changing the read process to use QUORUM and the issue still takes
>> place. Right now I'm running a 'nodetool repair' to see if that helps.
>> Our largest table 'doc' has the following stats:
>>
>> Table: doc
>> SSTable count: 28
>> Space used (live): 113609995010
>> Space used (total): 113609995010
>> Space used by snapshots (total): 0
>> Off heap memory used (total): 225006197
>> SSTable Compression Ratio: 0.37730474570644196
>> Number of partitions (estimate): 93641747
>> Memtable cell count: 0
>> Memtable data size: 0
>> Memtable off heap memory used: 0
>> Memtable switch count: 3712
>> Local read count: 891065091
>> Local read latency: NaN ms
>> Local write count: 7448281135
>> Local write latency: NaN ms
>> Pending flushes: 0
>> Percent repaired: 0.0
>> Bloom filter false positives: 988
>> Bloom filter false ratio: 0.00001
>> Bloom filter space used: 151149880
>> Bloom filter off heap memory used: 151149656
>> Index summary off heap memory used: 38654701
>> Compression metadata off heap memory used: 35201840
>> Compacted partition minimum bytes: 104
>> Compacted partition maximum bytes: 3379391
>> Compacted partition mean bytes: 3389
>> Average live cells per slice (last five minutes): NaN
>> Maximum live cells per slice (last five minutes): 0
>> Average tombstones per slice (last five minutes): NaN
>> Maximum tombstones per slice (last five minutes): 0
>> Dropped Mutations: 8174438
>>
>> Thoughts/ideas? Thank you!
>>
>> -Joe
>> On 12/2/2020 11:49 AM, Carl Mueller wrote:
>>
>> Why is one of your nodes only at 14.6% ownership? That's weird, unless
>> you have a small rowcount.
>>
>> Are you frequently deleting rows? Are you frequently writing rows at ONE?
>>
>> What version of cassandra?
>>
>>
>>
>> On Wed, Dec 2, 2020 at 9:56 AM Joe Obernberger <
>> joseph.obernberger@gmail.com> wrote:
>>
>>> Hi All - this is my first post here. I've been using Cassandra for
>>> several months now and am loving it. We are moving from Apache HBase to
>>> Cassandra for a big data analytics platform.
>>>
>>> I'm using java to get rows from Cassandra and very frequently get a
>>> java.util.NoSuchElementException when iterating through a ResultSet. If
>>> I retry this query again (often several times), it works. The debug log
>>> on the Cassandra nodes show this message:
>>> org.apache.cassandra.service.DigestMismatchException: Mismatch for key
>>> DecoratedKey
>>>
>>> My cluster looks like this:
>>>
>>> Datacenter: datacenter1
>>> =======================
>>> Status=Up/Down
>>> |/ State=Normal/Leaving/Joining/Moving
>>> -- Address Load Tokens Owns (effective) Host
>>> ID Rack
>>> UN 172.16.100.224 340.5 GiB 512 50.9%
>>> 8ba646ac-2b33-49de-a220-ae9842f18806 rack1
>>> UN 172.16.100.208 269.19 GiB 384 40.3%
>>> 4e0ba42f-649b-425a-857a-34497eb3036e rack1
>>> UN 172.16.100.225 282.83 GiB 512 50.4%
>>> 247f3d70-d13b-4d68-9a53-2ed58e01a63e rack1
>>> UN 172.16.110.3 409.78 GiB 768 63.2%
>>> 0abea102-06d2-4309-af36-a3163e8f00d8 rack1
>>> UN 172.16.110.4 330.15 GiB 512 50.6%
>>> 2a5ae735-6304-4e99-924b-44d9d5ec86b7 rack1
>>> UN 172.16.100.253 98.88 GiB 128 14.6%
>>> 6b528b0b-d7f7-4378-bba8-1857802d4f18 rack1
>>> UN 172.16.100.254 204.5 GiB 256 30.0%
>>> 87d0cb48-a57d-460e-bd82-93e6e52e93ea rack1
>>>
>>> I suspect this has to do with how I'm using consistency levels?
>>> Typically I'm using ONE. I just set the dclocal_read_repair_chance to
>>> 0.0, but I'm still seeing the issue. Any help/tips?
>>>
>>> Thank you!
>>>
>>> -Joe Obernberger
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
>>> For additional commands, e-mail: user-help@cassandra.apache.org
>>>
>>>
>>
--
Steve Lacerda
e. steve.lacerda@datastax.com
w. www.datastax.com
Re: Digest mismatch
Posted by Joe Obernberger <jo...@gmail.com>.
Clients are using an application.conf like:
datastax-java-driver {
basic.request.timeout = 60 seconds
basic.request.consistency = ONE
basic.contact-points = ["172.16.110.3:9042", "172.16.110.4:9042",
"172.16.100.208:9042", "172.16.100.224:9042", "172.16.100.225:9042",
"172.16.100.253:9042", "172.16.100.254:9042"]
basic.load-balancing-policy {
local-datacenter = datacenter1
}
}
So no, I'm not using a token aware policy. I'm googling that now...cuz
I don't know what it is!
-Joe
On 12/2/2020 12:18 PM, Carl Mueller wrote:
> Are you using token aware policy for the driver?
>
> If your writes are one and your reads are one, the propagation may not
> have happened depending on the coordinator that is used.
>
> TokenAware will make that a bit better.
>
> On Wed, Dec 2, 2020 at 11:12 AM Joe Obernberger
> <joseph.obernberger@gmail.com>
> wrote:
>
> Hi Carl - thank you for replying.
> I am using Cassandra 3.11.9-1
>
> Rows are not typically being deleted - I assume you're referring
> to Tombstones. I don't think that should be the case here as I
> don't think we've deleted anything here.
> This is a test cluster and some of the machines are small (hence
> the one node with 128 tokens and 14.6% - it has a lot less disk
> space than the other nodes). This is one of the features that I
> really like with Cassandra - being able to size nodes based on
> disk/CPU/RAM.
>
> All data is currently written with ONE. All data is read with
> ONE. I can replicate this issue at will, so can try different
> things easily. I tried changing the read process to use QUORUM
> and the issue still takes place. Right now I'm running a 'nodetool
> repair' to see if that helps. Our largest table 'doc' has the
> following stats:
>
> Table: doc
> SSTable count: 28
> Space used (live): 113609995010
> Space used (total): 113609995010
> Space used by snapshots (total): 0
> Off heap memory used (total): 225006197
> SSTable Compression Ratio: 0.37730474570644196
> Number of partitions (estimate): 93641747
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 3712
> Local read count: 891065091
> Local read latency: NaN ms
> Local write count: 7448281135
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 0.0
> Bloom filter false positives: 988
> Bloom filter false ratio: 0.00001
> Bloom filter space used: 151149880
> Bloom filter off heap memory used: 151149656
> Index summary off heap memory used: 38654701
> Compression metadata off heap memory used: 35201840
> Compacted partition minimum bytes: 104
> Compacted partition maximum bytes: 3379391
> Compacted partition mean bytes: 3389
> Average live cells per slice (last five minutes): NaN
> Maximum live cells per slice (last five minutes): 0
> Average tombstones per slice (last five minutes): NaN
> Maximum tombstones per slice (last five minutes): 0
> Dropped Mutations: 8174438
>
> Thoughts/ideas? Thank you!
>
> -Joe
>
> On 12/2/2020 11:49 AM, Carl Mueller wrote:
>> Why is one of your nodes only at 14.6% ownership? That's weird,
>> unless you have a small rowcount.
>>
>> Are you frequently deleting rows? Are you frequently writing rows
>> at ONE?
>>
>> What version of cassandra?
>>
>>
>>
>> On Wed, Dec 2, 2020 at 9:56 AM Joe Obernberger
>> <joseph.obernberger@gmail.com> wrote:
>>
>> Hi All - this is my first post here. I've been using
>> Cassandra for
>> several months now and am loving it. We are moving from
>> Apache HBase to
>> Cassandra for a big data analytics platform.
>>
>> I'm using java to get rows from Cassandra and very frequently
>> get a
>> java.util.NoSuchElementException when iterating through a
>> ResultSet. If
>> I retry this query again (often several times), it works.
>> The debug log
>> on the Cassandra nodes show this message:
>> org.apache.cassandra.service.DigestMismatchException:
>> Mismatch for key
>> DecoratedKey
>>
>> My cluster looks like this:
>>
>> Datacenter: datacenter1
>> =======================
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> -- Address Load Tokens Owns (effective)
>> Host
>> ID Rack
>> UN 172.16.100.224 340.5 GiB 512 50.9%
>> 8ba646ac-2b33-49de-a220-ae9842f18806 rack1
>> UN 172.16.100.208 269.19 GiB 384 40.3%
>> 4e0ba42f-649b-425a-857a-34497eb3036e rack1
>> UN 172.16.100.225 282.83 GiB 512 50.4%
>> 247f3d70-d13b-4d68-9a53-2ed58e01a63e rack1
>> UN 172.16.110.3 409.78 GiB 768 63.2%
>> 0abea102-06d2-4309-af36-a3163e8f00d8 rack1
>> UN 172.16.110.4 330.15 GiB 512 50.6%
>> 2a5ae735-6304-4e99-924b-44d9d5ec86b7 rack1
>> UN 172.16.100.253 98.88 GiB 128 14.6%
>> 6b528b0b-d7f7-4378-bba8-1857802d4f18 rack1
>> UN 172.16.100.254 204.5 GiB 256 30.0%
>> 87d0cb48-a57d-460e-bd82-93e6e52e93ea rack1
>>
>> I suspect this has to do with how I'm using consistency levels?
>> Typically I'm using ONE. I just set the
>> dclocal_read_repair_chance to
>> 0.0, but I'm still seeing the issue. Any help/tips?
>>
>> Thank you!
>>
>> -Joe Obernberger
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
>> For additional commands, e-mail: user-help@cassandra.apache.org
>>
>>
>
Re: Digest mismatch
Posted by Carl Mueller <ca...@smartthings.com.INVALID>.
Are you using token aware policy for the driver?
If your writes are one and your reads are one, the propagation may not have
happened depending on the coordinator that is used.
TokenAware will make that a bit better.
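For the 4.x java driver that the application.conf above belongs to, there is no separate TokenAware wrapper to install: the default load-balancing policy already routes by token whenever the driver can compute a routing key, which in practice means using prepared statements. A hedged sketch (table, column, and key names are illustrative, not from the thread), which needs a live cluster to run:

```java
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.DefaultConsistencyLevel;
import com.datastax.oss.driver.api.core.cql.BoundStatement;
import com.datastax.oss.driver.api.core.cql.PreparedStatement;

public class TokenAwareReadSketch {
    public static void main(String[] args) {
        // Picks up the same application.conf shown elsewhere in the thread.
        try (CqlSession session = CqlSession.builder().build()) {
            // Preparing the statement lets the driver learn the partition key,
            // so each bind() is routed to a replica that owns the token
            // instead of an arbitrary coordinator.
            PreparedStatement ps =
                    session.prepare("SELECT * FROM my_keyspace.doc WHERE id = ?");
            BoundStatement bound = ps.bind("some-id")
                    // Reading at QUORUM instead of ONE also masks replicas
                    // that have not yet received the write.
                    .setConsistencyLevel(DefaultConsistencyLevel.QUORUM);
            session.execute(bound).forEach(System.out::println);
        }
    }
}
```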
On Wed, Dec 2, 2020 at 11:12 AM Joe Obernberger <
joseph.obernberger@gmail.com> wrote:
> Hi Carl - thank you for replying.
> I am using Cassandra 3.11.9-1
>
> Rows are not typically being deleted - I assume you're referring to
> Tombstones. I don't think that should be the case here as I don't think
> we've deleted anything here.
> This is a test cluster and some of the machines are small (hence the one
> node with 128 tokens and 14.6% - it has a lot less disk space than the
> other nodes). This is one of the features that I really like with
> Cassandra - being able to size nodes based on disk/CPU/RAM.
>
> All data is currently written with ONE. All data is read with ONE. I can
> replicate this issue at will, so can try different things easily. I tried
> changing the read process to use QUORUM and the issue still takes place.
> Right now I'm running a 'nodetool repair' to see if that helps. Our
> largest table 'doc' has the following stats:
>
> Table: doc
> SSTable count: 28
> Space used (live): 113609995010
> Space used (total): 113609995010
> Space used by snapshots (total): 0
> Off heap memory used (total): 225006197
> SSTable Compression Ratio: 0.37730474570644196
> Number of partitions (estimate): 93641747
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 3712
> Local read count: 891065091
> Local read latency: NaN ms
> Local write count: 7448281135
> Local write latency: NaN ms
> Pending flushes: 0
> Percent repaired: 0.0
> Bloom filter false positives: 988
> Bloom filter false ratio: 0.00001
> Bloom filter space used: 151149880
> Bloom filter off heap memory used: 151149656
> Index summary off heap memory used: 38654701
> Compression metadata off heap memory used: 35201840
> Compacted partition minimum bytes: 104
> Compacted partition maximum bytes: 3379391
> Compacted partition mean bytes: 3389
> Average live cells per slice (last five minutes): NaN
> Maximum live cells per slice (last five minutes): 0
> Average tombstones per slice (last five minutes): NaN
> Maximum tombstones per slice (last five minutes): 0
> Dropped Mutations: 8174438
>
> Thoughts/ideas? Thank you!
>
> -Joe
> On 12/2/2020 11:49 AM, Carl Mueller wrote:
>
> Why is one of your nodes only at 14.6% ownership? That's weird, unless you
> have a small rowcount.
>
> Are you frequently deleting rows? Are you frequently writing rows at ONE?
>
> What version of cassandra?
>
>
>
> On Wed, Dec 2, 2020 at 9:56 AM Joe Obernberger <
> joseph.obernberger@gmail.com> wrote:
>
>> Hi All - this is my first post here. I've been using Cassandra for
>> several months now and am loving it. We are moving from Apache HBase to
>> Cassandra for a big data analytics platform.
>>
>> I'm using java to get rows from Cassandra and very frequently get a
>> java.util.NoSuchElementException when iterating through a ResultSet. If
>> I retry this query again (often several times), it works. The debug log
>> on the Cassandra nodes show this message:
>> org.apache.cassandra.service.DigestMismatchException: Mismatch for key
>> DecoratedKey
>>
>> My cluster looks like this:
>>
>> Datacenter: datacenter1
>> =======================
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> -- Address Load Tokens Owns (effective) Host
>> ID Rack
>> UN 172.16.100.224 340.5 GiB 512 50.9%
>> 8ba646ac-2b33-49de-a220-ae9842f18806 rack1
>> UN 172.16.100.208 269.19 GiB 384 40.3%
>> 4e0ba42f-649b-425a-857a-34497eb3036e rack1
>> UN 172.16.100.225 282.83 GiB 512 50.4%
>> 247f3d70-d13b-4d68-9a53-2ed58e01a63e rack1
>> UN 172.16.110.3 409.78 GiB 768 63.2%
>> 0abea102-06d2-4309-af36-a3163e8f00d8 rack1
>> UN 172.16.110.4 330.15 GiB 512 50.6%
>> 2a5ae735-6304-4e99-924b-44d9d5ec86b7 rack1
>> UN 172.16.100.253 98.88 GiB 128 14.6%
>> 6b528b0b-d7f7-4378-bba8-1857802d4f18 rack1
>> UN 172.16.100.254 204.5 GiB 256 30.0%
>> 87d0cb48-a57d-460e-bd82-93e6e52e93ea rack1
>>
>> I suspect this has to do with how I'm using consistency levels?
>> Typically I'm using ONE. I just set the dclocal_read_repair_chance to
>> 0.0, but I'm still seeing the issue. Any help/tips?
>>
>> Thank you!
>>
>> -Joe Obernberger
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
>> For additional commands, e-mail: user-help@cassandra.apache.org
>>
>>
>
>
>
Re: Digest mismatch
Posted by Joe Obernberger <jo...@gmail.com>.
Hi Carl - thank you for replying.
I am using Cassandra 3.11.9-1
Rows are not typically being deleted - I assume you're referring to
Tombstones. I don't think that should be the case here as I don't think
we've deleted anything here.
This is a test cluster and some of the machines are small (hence the one
node with 128 tokens and 14.6% - it has a lot less disk space than the
other nodes). This is one of the features that I really like with
Cassandra - being able to size nodes based on disk/CPU/RAM.
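The arithmetic behind that can be sketched: with vnodes, each node's share of the ring is roughly proportional to its num_tokens, and the Owns (effective) column in the nodetool output sums to 300%, which implies a replication factor of 3. A rough back-of-the-envelope check (the estimate ignores the randomness of vnode placement, so the real numbers drift from it):

```java
public class OwnershipSketch {
    // Effective ownership ~ node's share of all vnodes, scaled by the
    // replication factor (RF = 3 inferred from Owns summing to 300%).
    static double estimate(int nodeTokens, int totalTokens, int rf) {
        return 100.0 * nodeTokens / totalTokens * rf;
    }

    public static void main(String[] args) {
        // num_tokens per node, from the nodetool status in this thread.
        int[] tokens = {512, 384, 512, 768, 512, 128, 256};
        int total = 0;
        for (int t : tokens) total += t; // 3072 vnodes in the ring
        for (int t : tokens) {
            System.out.printf("%4d tokens -> ~%.1f%% estimated ownership%n",
                    t, estimate(t, total, 3));
        }
    }
}
```

The 128-token node comes out around 12.5%, in the same ballpark as the reported 14.6%, so the low ownership is expected for a deliberately small node rather than a sign of a problem.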
All data is currently written with ONE. All data is read with ONE. I
can replicate this issue at will, so can try different things easily. I
tried changing the read process to use QUORUM and the issue still takes
place. Right now I'm running a 'nodetool repair' to see if that helps.
Our largest table 'doc' has the following stats:
Table: doc
SSTable count: 28
Space used (live): 113609995010
Space used (total): 113609995010
Space used by snapshots (total): 0
Off heap memory used (total): 225006197
SSTable Compression Ratio: 0.37730474570644196
Number of partitions (estimate): 93641747
Memtable cell count: 0
Memtable data size: 0
Memtable off heap memory used: 0
Memtable switch count: 3712
Local read count: 891065091
Local read latency: NaN ms
Local write count: 7448281135
Local write latency: NaN ms
Pending flushes: 0
Percent repaired: 0.0
Bloom filter false positives: 988
Bloom filter false ratio: 0.00001
Bloom filter space used: 151149880
Bloom filter off heap memory used: 151149656
Index summary off heap memory used: 38654701
Compression metadata off heap memory used: 35201840
Compacted partition minimum bytes: 104
Compacted partition maximum bytes: 3379391
Compacted partition mean bytes: 3389
Average live cells per slice (last five minutes): NaN
Maximum live cells per slice (last five minutes): 0
Average tombstones per slice (last five minutes): NaN
Maximum tombstones per slice (last five minutes): 0
Dropped Mutations: 8174438
Thoughts/ideas? Thank you!
-Joe
On 12/2/2020 11:49 AM, Carl Mueller wrote:
> Why is one of your nodes only at 14.6% ownership? That's weird, unless
> you have a small rowcount.
>
> Are you frequently deleting rows? Are you frequently writing rows at ONE?
>
> What version of cassandra?
>
>
>
> On Wed, Dec 2, 2020 at 9:56 AM Joe Obernberger
> <joseph.obernberger@gmail.com>
> wrote:
>
> Hi All - this is my first post here. I've been using Cassandra for
> several months now and am loving it. We are moving from Apache
> HBase to
> Cassandra for a big data analytics platform.
>
> I'm using java to get rows from Cassandra and very frequently get a
> java.util.NoSuchElementException when iterating through a
> ResultSet. If
> I retry this query again (often several times), it works. The
> debug log
> on the Cassandra nodes show this message:
> org.apache.cassandra.service.DigestMismatchException: Mismatch for
> key
> DecoratedKey
>
> My cluster looks like this:
>
> Datacenter: datacenter1
> =======================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> -- Address Load Tokens Owns (effective) Host
> ID Rack
> UN 172.16.100.224 340.5 GiB 512 50.9%
> 8ba646ac-2b33-49de-a220-ae9842f18806 rack1
> UN 172.16.100.208 269.19 GiB 384 40.3%
> 4e0ba42f-649b-425a-857a-34497eb3036e rack1
> UN 172.16.100.225 282.83 GiB 512 50.4%
> 247f3d70-d13b-4d68-9a53-2ed58e01a63e rack1
> UN 172.16.110.3 409.78 GiB 768 63.2%
> 0abea102-06d2-4309-af36-a3163e8f00d8 rack1
> UN 172.16.110.4 330.15 GiB 512 50.6%
> 2a5ae735-6304-4e99-924b-44d9d5ec86b7 rack1
> UN 172.16.100.253 98.88 GiB 128 14.6%
> 6b528b0b-d7f7-4378-bba8-1857802d4f18 rack1
> UN 172.16.100.254 204.5 GiB 256 30.0%
> 87d0cb48-a57d-460e-bd82-93e6e52e93ea rack1
>
> I suspect this has to do with how I'm using consistency levels?
> Typically I'm using ONE. I just set the
> dclocal_read_repair_chance to
> 0.0, but I'm still seeing the issue. Any help/tips?
>
> Thank you!
>
> -Joe Obernberger
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
> For additional commands, e-mail: user-help@cassandra.apache.org
>
>
Re: Digest mismatch
Posted by Carl Mueller <ca...@smartthings.com.INVALID>.
Why is one of your nodes only at 14.6% ownership? That's weird, unless you
have a small rowcount.
Are you frequently deleting rows? Are you frequently writing rows at ONE?
What version of cassandra?
On Wed, Dec 2, 2020 at 9:56 AM Joe Obernberger <jo...@gmail.com>
wrote:
> Hi All - this is my first post here. I've been using Cassandra for
> several months now and am loving it. We are moving from Apache HBase to
> Cassandra for a big data analytics platform.
>
> I'm using java to get rows from Cassandra and very frequently get a
> java.util.NoSuchElementException when iterating through a ResultSet. If
> I retry this query again (often several times), it works. The debug log
> on the Cassandra nodes show this message:
> org.apache.cassandra.service.DigestMismatchException: Mismatch for key
> DecoratedKey
>
> My cluster looks like this:
>
> Datacenter: datacenter1
> =======================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> -- Address Load Tokens Owns (effective) Host
> ID Rack
> UN 172.16.100.224 340.5 GiB 512 50.9%
> 8ba646ac-2b33-49de-a220-ae9842f18806 rack1
> UN 172.16.100.208 269.19 GiB 384 40.3%
> 4e0ba42f-649b-425a-857a-34497eb3036e rack1
> UN 172.16.100.225 282.83 GiB 512 50.4%
> 247f3d70-d13b-4d68-9a53-2ed58e01a63e rack1
> UN 172.16.110.3 409.78 GiB 768 63.2%
> 0abea102-06d2-4309-af36-a3163e8f00d8 rack1
> UN 172.16.110.4 330.15 GiB 512 50.6%
> 2a5ae735-6304-4e99-924b-44d9d5ec86b7 rack1
> UN 172.16.100.253 98.88 GiB 128 14.6%
> 6b528b0b-d7f7-4378-bba8-1857802d4f18 rack1
> UN 172.16.100.254 204.5 GiB 256 30.0%
> 87d0cb48-a57d-460e-bd82-93e6e52e93ea rack1
>
> I suspect this has to do with how I'm using consistency levels?
> Typically I'm using ONE. I just set the dclocal_read_repair_chance to
> 0.0, but I'm still seeing the issue. Any help/tips?
>
> Thank you!
>
> -Joe Obernberger
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@cassandra.apache.org
> For additional commands, e-mail: user-help@cassandra.apache.org
>
>