Posted to issues@ignite.apache.org by "Vyacheslav Koptilin (Jira)" <ji...@apache.org> on 2022/10/18 11:47:00 UTC
[jira] [Updated] (IGNITE-17835) partition lost check improvement

     [ https://issues.apache.org/jira/browse/IGNITE-17835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vyacheslav Koptilin updated IGNITE-17835:
-----------------------------------------
    Description: 
Start two nodes with native persistence enabled, and then activate the cluster.

Create a table with no backups, using SQL like the following:
{noformat}
CREATE TABLE City (
  ID INT,
  Name VARCHAR,
  CountryCode CHAR(3),
  District VARCHAR,
  Population INT,
  PRIMARY KEY (ID, CountryCode)
) WITH "template=partitioned, affinityKey=CountryCode, CACHE_NAME=City, KEY_TYPE=demo.model.CityKey, VALUE_TYPE=demo.model.City";
INSERT INTO City(ID, Name, CountryCode, District, Population) VALUES (1,'Kabul','AFG','Kabol',1780000);
INSERT INTO City(ID, Name, CountryCode, District, Population) VALUES (2,'Qandahar','AFG','Qandahar',237500);
{noformat}

Then execute
{noformat}
SELECT COUNT(*) FROM city;
{noformat}

The result is OK.

Then kill one node and execute
{noformat}SELECT COUNT(*) FROM city;{noformat}

The result is
{noformat}Failed to execute query because cache partition has been lostPart [cacheName=City, part=0]{noformat}

This is expected behavior as well.

Next, start the node that was shut down before and execute the same request: {noformat}SELECT COUNT(*) FROM city;{noformat}

The result is the following:
{noformat}Failed to execute query because cache partition has been lostPart [cacheName=City, part=0]{noformat}

At this point, all partitions have been recovered and all baseline nodes are ONLINE, so having to run the reset_lost_partitions operation seems redundant.
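
The expectation above can be illustrated with a toy model. This is only a sketch of the kind of check being asked for, not Ignite code: the class name, the method name, and the representation of nodes as plain strings are all hypothetical, and it assumes that a LOST partition counts as recovered once at least one node that persists its data is back online.

```java
import java.util.Map;
import java.util.Set;

/** Toy model (hypothetical, not part of Ignite) of an automatic lost-partition check. */
public class LostPartitionCheck {
    /**
     * @param onlineNodes         IDs of the nodes currently in the topology (assumed representation)
     * @param lostPartitionOwners for each partition marked LOST, the nodes that persist its data
     * @return true if every LOST partition has at least one owner online again,
     *         i.e. a manual reset would add nothing under this model
     */
    public static boolean lostStateCanBeCleared(Set<String> onlineNodes,
                                                Map<Integer, Set<String>> lostPartitionOwners) {
        for (Set<String> owners : lostPartitionOwners.values()) {
            boolean anyOwnerOnline = owners.stream().anyMatch(onlineNodes::contains);
            if (!anyOwnerOnline) {
                return false; // this partition's data is still unavailable
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Partition 0 lives only on node2 (no backups), as in the scenario above.
        Map<Integer, Set<String>> lost = Map.of(0, Set.of("node2"));

        // node2 is down: the partition is genuinely lost.
        System.out.println(lostStateCanBeCleared(Set.of("node1"), lost));          // false

        // node2 has been restarted: all owners are back.
        System.out.println(lostStateCanBeCleared(Set.of("node1", "node2"), lost)); // true
    }
}
```

In the second call the model reports that nothing is missing any more, which is the situation in which the query above still fails until the lost state is reset manually.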

  was:
Start two nodes with native persistence enabled, and then activate the cluster.

Create a table with no backups, using SQL like the following:

CREATE TABLE City (
  ID INT,
  Name VARCHAR,
  CountryCode CHAR(3),
  District VARCHAR,
  Population INT,
  PRIMARY KEY (ID, CountryCode)
) WITH "template=partitioned, affinityKey=CountryCode, CACHE_NAME=City, KEY_TYPE=demo.model.CityKey, VALUE_TYPE=demo.model.City";

INSERT INTO City(ID, Name, CountryCode, District, Population) VALUES (1,'Kabul','AFG','Kabol',1780000);
INSERT INTO City(ID, Name, CountryCode, District, Population) VALUES (2,'Qandahar','AFG','Qandahar',237500);

then execute SELECT COUNT(*) FROM city; 


normal.

then kill one node.

then execute SELECT COUNT(*) FROM city; 


Failed to execute query because cache partition has been lostPart [cacheName=City, part=0]

This is also normal.

Next, start the node that was shut down before.

then execute SELECT COUNT(*) FROM city; 


Failed to execute query because cache partition has been lostPart [cacheName=City, part=0]

At this point, all partitions have been recovered and all baseline nodes are ONLINE, so having to run the reset_lost_partitions operation seems redundant.


> partition lost check improvement
> --------------------------------
>
>                 Key: IGNITE-17835
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17835
>             Project: Ignite
>          Issue Type: Improvement
>          Components: cache
>    Affects Versions: 2.13
>            Reporter: YuJue Li
>            Priority: Major
>             Fix For: 2.15
>
>
> Start two nodes with native persistence enabled, and then activate the cluster.
> Create a table with no backups, using SQL like the following:
> {noformat}
> CREATE TABLE City (
>   ID INT,
>   Name VARCHAR,
>   CountryCode CHAR(3),
>   District VARCHAR,
>   Population INT,
>   PRIMARY KEY (ID, CountryCode)
> ) WITH "template=partitioned, affinityKey=CountryCode, CACHE_NAME=City, KEY_TYPE=demo.model.CityKey, VALUE_TYPE=demo.model.City";
> INSERT INTO City(ID, Name, CountryCode, District, Population) VALUES (1,'Kabul','AFG','Kabol',1780000);
> INSERT INTO City(ID, Name, CountryCode, District, Population) VALUES (2,'Qandahar','AFG','Qandahar',237500);
> {noformat}
> Then execute
> {noformat}
> SELECT COUNT(*) FROM city;
> {noformat}
> The result is OK.
> Then kill one node and execute
> {noformat}SELECT COUNT(*) FROM city;{noformat}
> The result is
> {noformat}Failed to execute query because cache partition has been lostPart [cacheName=City, part=0]{noformat}
> This is expected behavior as well.
> Next, start the node that was shut down before and execute the same request: {noformat}SELECT COUNT(*) FROM city;{noformat}
> The result is the following:
> {noformat}Failed to execute query because cache partition has been lostPart [cacheName=City, part=0]{noformat}
> At this point, all partitions have been recovered and all baseline nodes are ONLINE, so having to run the reset_lost_partitions operation seems redundant.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)