Posted to commits@cassandra.apache.org by "Alex Petrov (JIRA)" <ji...@apache.org> on 2017/04/06 15:07:41 UTC

[jira] [Updated] (CASSANDRA-13302) last row of previous page == first row of next page while querying data using SASI index

     [ https://issues.apache.org/jira/browse/CASSANDRA-13302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Petrov updated CASSANDRA-13302:
------------------------------------
    Status: Patch Available  (was: Open)

Attaching a simple patch and a test that reproduces the paging problem. It occurs only in tables without clustering columns, because the left bound is included again on subsequent pages.
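
To illustrate the left-bound issue in isolation, here is a minimal toy model in plain Java. It is not the Cassandra code path or the attached patch: the class {{PagingBoundSketch}} and the methods {{fetchPage}} and {{readAll}} are hypothetical names, and the in-memory list of keys stands in for a table with a simple primary key. It only shows how resuming a page from the last returned key with an inclusive bound repeats that key, while an exclusive bound does not.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical names, no Cassandra or driver classes involved.
class PagingBoundSketch
{
    // Returns up to pageSize keys from sortedKeys, starting at `from`.
    // With inclusive == true, a key equal to `from` is returned as well.
    static List<Long> fetchPage(List<Long> sortedKeys, long from, boolean inclusive, int pageSize)
    {
        List<Long> page = new ArrayList<>();
        for (long key : sortedKeys)
        {
            boolean withinBound = inclusive ? key >= from : key > from;
            if (withinBound && page.size() < pageSize)
                page.add(key);
        }
        return page;
    }

    // Reads all keys page by page, resuming each page from the last key of the previous one.
    // inclusiveResume selects whether that last key is included again on the next page.
    static List<Long> readAll(List<Long> sortedKeys, boolean inclusiveResume, int pageSize)
    {
        // With an inclusive resume bound, a page size of 1 would never advance.
        if (pageSize < 2)
            throw new IllegalArgumentException("pageSize must be at least 2");

        List<Long> result = new ArrayList<>();
        long from = Long.MIN_VALUE;
        boolean inclusive = true; // the first page always starts at the very beginning
        while (true)
        {
            List<Long> page = fetchPage(sortedKeys, from, inclusive, pageSize);
            result.addAll(page);
            if (page.size() < pageSize)
                break; // a short page means the data is exhausted
            from = page.get(page.size() - 1);
            inclusive = inclusiveResume;
        }
        return result;
    }

    public static void main(String[] args)
    {
        List<Long> keys = new ArrayList<>();
        for (long i = 0; i < 10; i++)
            keys.add(i);

        System.out.println("inclusive resume: " + readAll(keys, true, 4));
        System.out.println("exclusive resume: " + readAll(keys, false, 4));
    }
}
```

With ten keys and a page size of 4, the inclusive variant returns 13 entries, repeating the last key of each page (3, 6 and 9) at the start of the next one. That is the same pattern as the "last row of previous page == first row of next page" symptom in the report. The exclusive variant returns each key exactly once.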

|[trunk|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:13379-trunk]|[testall|http://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13379-trunk-testall/]|[dtest|http://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13379-trunk-dtest/]|

> last row of previous page == first row of next page while querying data using SASI index
> ----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-13302
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13302
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Tested with C* 3.9 and 3.10.
>            Reporter: Andy Tolbert
>            Assignee: Alex Petrov
>
> Apologies if this is a duplicate (couldn't track down an existing bug).
> Similarly to [CASSANDRA-11208], it appears to be possible to retrieve duplicate rows when paging with a SASI index, as documented in [JAVA-1413|https://datastax-oss.atlassian.net/browse/JAVA-1413]. The following test demonstrates that data is repeated when querying with a SASI index:
> {code:java}
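> import com.datastax.driver.core.Cluster;
> import com.datastax.driver.core.PreparedStatement;
> import com.datastax.driver.core.Row;
> import com.datastax.driver.core.Session;
>
> public class TestPagingBug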
> {
> 	public static void main(String[] args)
> 	{
> 		Cluster.Builder builder = Cluster.builder();
> 		Cluster c = builder.addContactPoints("192.168.98.190").build();		
> 		Session s = c.connect();
> 		
> 		s.execute("CREATE KEYSPACE IF NOT EXISTS test WITH replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 }");
> 		s.execute("CREATE TABLE IF NOT EXISTS test.test_table_sec(sec BIGINT PRIMARY KEY, id INT)");
> 		// Create a SASI index on the id column, which the SELECT statement below filters on
> 		String index = "CREATE CUSTOM INDEX test_table_sec_idx ON test.test_table_sec (id) USING 'org.apache.cassandra.index.sasi.SASIIndex' "
> 				+ "WITH OPTIONS = { 'mode': 'PREFIX' }";
> 		s.execute(index);
> 		
> 		PreparedStatement insert = s.prepare("INSERT INTO test.test_table_sec (id, sec) VALUES (1, ?)");		
> 		for (int i = 0; i < 1000; i++)
> 			s.execute(insert.bind((long) i));
> 		
> 		PreparedStatement select = s.prepare("SELECT sec FROM test.test_table_sec WHERE id = 1");
> 		
> 		long lastSec = -1;		
> 		for (Row row : s.execute(select.bind().setFetchSize(300)))
> 		{
> 			long sec = row.getLong("sec");
> 			if (sec == lastSec)
> 				System.out.println(String.format("Duplicated id %d", sec));
> 			
> 			lastSec = sec;
> 		}
> 		System.exit(0);
> 	}
> }
> {code}
> The program outputs the following:
> {noformat}
> Duplicated id 23
> Duplicated id 192
> Duplicated id 684
> {noformat}
> Note that a simple primary key (no clustering columns) is required to reproduce this.


