Posted to commits@cassandra.apache.org by "feroz shaik (Jira)" <ji...@apache.org> on 2019/09/26 12:41:00 UTC
[jira] [Commented] (CASSANDRA-15263) LegacyLayout RangeTombstoneList throws java.lang.NullPointerException: null

    [ https://issues.apache.org/jira/browse/CASSANDRA-15263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16938558#comment-16938558 ] 

feroz shaik commented on CASSANDRA-15263:
-----------------------------------------

Dear [~benedict] & Shalom, 

I have been working with my engineering and business teams to move the upgrade task forward, but some ambiguity remains, so I am checking whether you can assist. We want to be absolutely sure that the application continues to work normally after the upgrade, and that can only be answered with test results. So I have started testing this in my lab (3 nodes).

I was able to reproduce the error with another table, but I cannot reproduce it with the table that the application actually uses.

I got the exact query patterns from the application team by enabling debug:

static String INSERT_SCHEDULED_DELETION_QUERY = "INSERT INTO \"sal_purge\" (key,column1,value) VALUES (?,?,?) USING TIMESTAMP ?;";
static String SELECT_SCHEDULED_DELETION_QUERY = "SELECT column1, value FROM sal_purge where key=? AND column1>=? LIMIT ?;";
static String DELETE_SCHEDULED_DELETION_QUERY = "DELETE FROM \"sal_purge\" USING TIMESTAMP ? WHERE key=? AND column1=?;";

 

My table structure, if you remember:

CREATE TABLE "SAL".sal_purge (
    key text,
    column1 text,
    column2 text,
    value text,
    PRIMARY KEY (key, column1, column2)
) WITH COMPACT STORAGE
    AND CLUSTERING ORDER BY (column1 ASC, column2 ASC)
    AND bloom_filter_fp_chance = 0.1
    AND caching = '{"keys":"NONE", "rows_per_partition":"NONE"}'
    AND comment = 'Holds items to be removed as [shardid][salid][timestamp]. The table records SALIDs to be deleted along with their deletion times (which may be modified)'
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
    AND compression = {'chunk_length_kb': '64', 'sstable_compression': 'org.apache.cassandra.io.compress.SnappyCompressor'}
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.1
    AND speculative_retry = '99.0PERCENTILE';

 

I really cannot understand how this query, "DELETE FROM \"sal_purge\" USING TIMESTAMP ? WHERE key=? AND column1=?;", can cause a range tombstone. As far as I can see, these can only be row deletions, can't they?
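
One possible reading, stated as an assumption rather than a confirmed diagnosis: the primary key is (key, column1, column2), and the DELETE above fixes only key and column1, leaving column2 open. A deletion that fixes only a prefix of the clustering columns covers a range of clustering values rather than a single row. The sketch below models that rule in plain Java. The class and method names are hypothetical, and nothing here is Cassandra or driver code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: not part of Cassandra or any driver.
class DeletionKind {

    // Classifies a DELETE by how many clustering columns its WHERE clause
    // fixes with equality, assuming the partition key is fully specified.
    static String classify(List<String> clusteringColumns, List<String> restricted) {
        if (restricted.isEmpty()) {
            return "partition deletion";
        }
        // The restricted columns must form a prefix of the clustering columns.
        for (int i = 0; i < restricted.size(); i++) {
            if (i >= clusteringColumns.size()
                    || !clusteringColumns.get(i).equals(restricted.get(i))) {
                throw new IllegalArgumentException("not a clustering prefix: " + restricted);
            }
        }
        return restricted.size() == clusteringColumns.size()
                ? "row deletion"
                : "range deletion";
    }

    public static void main(String[] args) {
        // Clustering columns of sal_purge, taken from the schema in this comment.
        List<String> clustering = Arrays.asList("column1", "column2");

        // WHERE key=? AND column1=?  (column2 left unrestricted)
        System.out.println(classify(clustering, Arrays.asList("column1")));

        // WHERE key=? AND column1=? AND column2=?
        System.out.println(classify(clustering, Arrays.asList("column1", "column2")));
    }
}
```

Under this model, the application's DELETE falls into the "range deletion" case because column2 is never restricted. Whether that is what produces the range tombstones seen in the affected sstables would still need to be checked against the sstabledump output attached to the ticket.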

 

 

Next, I inserted some data, ran some deletes, and upgraded the first node while running a range select against node 2. I am still unable to reproduce the problem.

cqlsh> select * from "SAL".sal_purge;

 key | column1                                                          | column2 | value
-----+------------------------------------------------------------------+---------+---------------
 15e | 15e000b946229403b2010e542724cb9af7b939df0c5dd7c5cb680b28881b0905 |    null | 1568628121530
 15e | 15e000cdbcadb3ea81fe66a2023ccae8dde2cd1c0cdcd25c1b011f5cf4520411 |    null | 1568661805145
 15e | 15e000d5d1b5ea0e692046b51901d6efe7e1e9beb1f873f5bf62cd9b8b78b6a7 |    null | 1568717250061
 15e | 15e00205a8a8beafb571f0824022061975f239ffdfec02ae5ec2282d6db76e5b |    null | 1568776769724
 15e | 15e00322575230ed208cfa101afac66e790582ecd179339e16ee811d41bbac08 |    null | 1568755358394
 15e | 15e003fb651375db41cbcba3c37f0ab02f9308ba2e3708a5e7be359189583f26 |    null | 1568630537539
 15e | 15e0049bce42be7e46c7beb6f8f878abd83350556de3864477e94fc33643e818 |    null | 1568675019827
 15e | 15e005cb926ca2b723260927c3ba9d16ee8092d6adb78d01e129815617e26251 |    null | 1568642160143
 15e | 15e007f24dd7c31358a23869b06fbc24095e133cdee9cc9e5af88e2a836e9c03 |    null | 1568749318982
 15e | 15e0088cdb99f399290e531a452e4c2c32d547a852607cb2819ff8fbd565ed53 |    null | 1568690314042

(10 rows)

 

cqlsh> DELETE FROM "SAL".sal_purge USING TIMESTAMP 400 where key='15e' and column1='15e0049bce42be7e46c7beb6f8f878abd83350556de3864477e94fc33643e818';
cqlsh> DELETE FROM "SAL".sal_purge USING TIMESTAMP 300 where key='15e' and column1='15e005cb926ca2b723260927c3ba9d16ee8092d6adb78d01e129815617e26251';
cqlsh> DELETE FROM "SAL".sal_purge USING TIMESTAMP 200 where key='15e' and column1='15e007f24dd7c31358a23869b06fbc24095e133cdee9cc9e5af88e2a836e9c03';
cqlsh> select key,column1,writetime(value) from "SAL".sal_purge  where key='15e';

 key | column1                                                          | writetime(value)
-----+------------------------------------------------------------------+------------------
 15e | 15e000b946229403b2010e542724cb9af7b939df0c5dd7c5cb680b28881b0905 |             1000
 15e | 15e000cdbcadb3ea81fe66a2023ccae8dde2cd1c0cdcd25c1b011f5cf4520411 |              900
 15e | 15e000d5d1b5ea0e692046b51901d6efe7e1e9beb1f873f5bf62cd9b8b78b6a7 |              800
 15e | 15e00205a8a8beafb571f0824022061975f239ffdfec02ae5ec2282d6db76e5b |              700
 15e | 15e00322575230ed208cfa101afac66e790582ecd179339e16ee811d41bbac08 |              600
 15e | 15e0088cdb99f399290e531a452e4c2c32d547a852607cb2819ff8fbd565ed53 |              100

(6 rows)

 

The upgrade on node 1 was smooth and successful, without any exceptions. Am I missing something that prevents me from reproducing the problem?

 

 

> LegacyLayout RangeTombstoneList throws java.lang.NullPointerException: null
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15263
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15263
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Schema
>            Reporter: feroz shaik
>            Assignee: Benedict Elliott Smith
>            Priority: Normal
>              Labels: 2.1.16, 3.11.4
>         Attachments: sample.system.log, schema.txt, sstabledump_sal_purge_d03.json, sstablemetadata_sal_purge_d03, stack_trace.txt, system.log, system.log, system.log, system.log, system_latest.log
>
>
> We hit a problem today while upgrading from 2.1.16 to 3.11.4.
> We encountered it as soon as the first node started up with 3.11.4.
> The full error stack is attached - [^stack_trace.txt] 
>  
> The below errors continued in the log file as long as the process was up.
> ERROR [Native-Transport-Requests-12] 2019-08-06 03:00:47,135 ErrorMessage.java:384 - Unexpected exception during request
>  java.lang.NullPointerException: null
>  ERROR [Native-Transport-Requests-8] 2019-08-06 03:00:48,778 ErrorMessage.java:384 - Unexpected exception during request
>  java.lang.NullPointerException: null
>  ERROR [Native-Transport-Requests-13] 2019-08-06 03:00:57,454 
>  
> The nodetool version reports 3.11.4, and the number of connections on the native port (9042) was similar to the other nodes. The exceptions were alarming enough that we had to call off the change. Any help or insight into this problem from the community is appreciated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
