Posted to issues@ignite.apache.org by "Maxim Muzafarov (Jira)" <ji...@apache.org> on 2022/05/25 11:13:00 UTC
[jira] [Commented] (IGNITE-11998) Fix DataPageScan for fragmented pages.

    [ https://issues.apache.org/jira/browse/IGNITE-11998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541980#comment-17541980 ] 

Maxim Muzafarov commented on IGNITE-11998:
------------------------------------------

h4. The initial proposal

Currently, during a full scan of a cache group partition (SqlQuery or ScanQuery), all data is read through the partition B-Tree, which leads to _O(n log n)_ complexity. For such queries it may be preferable to read all data by reading pages sequentially, directly from the partition file. This has _O(n)_ complexity, and sequential file reads also have advantages over random-access reads.

h4. The main issue

According to [Ignite Multi-Tier Storage - under the hood|https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Multi-Tier+Storage+-+under+the+hood#IgniteMultiTierStorageunderthehood-Longobjects], long objects are split across several pages. A page that contains an entry tail has no dedicated page attribute or page header flag to identify it; however, such a page holds a link to another fragment or to the entry head. These pages may only be accessed from the page that contains the entry head.

h4. Current solution and benchmarks

_The double loop over all partition pages._

During the first loop we read all the pages and collect references to other pages (entries are read from head to tail and written from tail to head). During the second loop we build the list of pages that no other page references; these are the pages that contain the entry heads to be read.
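
The double loop can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the actual Ignite code: {{PageRef}} and {{TwoPassHeadFinder}} are hypothetical names, and each page is modelled as just its id plus the ids of the pages it links to (a real data page holds several items).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of a data page: its id and the ids of the pages it links to. */
final class PageRef {
    final long pageId;
    final long[] linkedPageIds;

    PageRef(long pageId, long... linkedPageIds) {
        this.pageId = pageId;
        this.linkedPageIds = linkedPageIds;
    }
}

/** Hypothetical helper illustrating the double loop over all partition pages. */
final class TwoPassHeadFinder {
    static List<Long> findHeadPages(List<PageRef> partitionPages) {
        // First loop: collect the ids of all pages that some page links to.
        Set<Long> referenced = new HashSet<>();

        for (PageRef page : partitionPages) {
            for (long linked : page.linkedPageIds)
                referenced.add(linked);
        }

        // Second loop: keep the pages that nobody links to.
        // In this model these are the pages holding entry heads.
        List<Long> heads = new ArrayList<>();

        for (PageRef page : partitionPages) {
            if (!referenced.contains(page.pageId))
                heads.add(page.pageId);
        }

        return heads;
    }
}
```

For example, with pages 1 -> 2 -> 3 forming one fragmented entry and page 4 holding an unfragmented entry, only pages 1 and 4 are returned. The cost is two passes over the partition plus a set of referenced page ids kept in memory.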

||Data Page Scan||true||false||
|IgniteDataPageScanBenchmark|148848|179228|
|IgniteDataPageScanBenchmark|186917|166980|
|IgniteDataPageScanBenchmark|197114|175667|

h4. Possible solutions

Additional analysis and investigation are required to perform the full partition scan in a single loop. We need to identify the fragmented pages that contain entry tails:
- for such pages we can write a marker value, e.g. {{-1}}, to {{freeSpace}}, {{directCounter}}, {{indirectCounter}} (currently zero is written); here we need to check PDS compatibility.
- almost the same issue with identifying fragmented pages is described in IGNITE-12510
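
A rough sketch of how the marker idea could allow a single loop. Everything here is an assumption for illustration: {{MarkedPage}}, {{SinglePassScan}} and {{TAIL_MARKER}} are hypothetical names, only the {{freeSpace}} field is modelled, and the real header layout and field types are not reflected.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical page header model; only the freeSpace field is represented. */
final class MarkedPage {
    /** Assumed marker written to freeSpace for pages that hold only entry tails. */
    static final int TAIL_MARKER = -1;

    final long pageId;
    final int freeSpace;

    MarkedPage(long pageId, int freeSpace) {
        this.pageId = pageId;
        this.freeSpace = freeSpace;
    }

    boolean isTailOnly() {
        return freeSpace == TAIL_MARKER;
    }
}

/** Hypothetical single-loop scan that skips pages marked as tail-only. */
final class SinglePassScan {
    static List<Long> scannablePages(List<MarkedPage> partitionPages) {
        List<Long> result = new ArrayList<>();

        for (MarkedPage page : partitionPages) {
            // A tail-only page is reachable only through its entry head, so skip it here.
            if (!page.isTailOnly())
                result.add(page.pageId);
        }

        return result;
    }
}
```

In this model a page with {{freeSpace}} equal to zero is still scanned, so the marker has to be a value that a regular page can never store. Partitions written before such a change would not carry the marker, which is the PDS compatibility concern mentioned above.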


> Fix DataPageScan for fragmented pages.
> --------------------------------------
>
>                 Key: IGNITE-11998
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11998
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Ivan Bessonov
>            Assignee: Maxim Muzafarov
>            Priority: Critical
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Fragmented pages crash JVM when accessed by DataPageScan scanner/query optimized scanner. It happens when scanner accesses data in later chunk in fragmented entry but treats it like the first one, expecting length of the payload, which is absent and replaced with raw entry data.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)