Posted to issues@ignite.apache.org by "Julia Bakulina (Jira)" <ji...@apache.org> on 2023/05/12 09:33:00 UTC
[jira] [Updated] (IGNITE-17157) Documentation of the Ignite index reader

     [ https://issues.apache.org/jira/browse/IGNITE-17157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julia Bakulina updated IGNITE-17157:
------------------------------------
    Release Note: Added documentation for Ignite index-reader

> Documentation of the Ignite index reader
> ----------------------------------------
>
>                 Key: IGNITE-17157
>                 URL: https://issues.apache.org/jira/browse/IGNITE-17157
>             Project: Ignite
>          Issue Type: Task
>            Reporter: Denis Chudov
>            Assignee: Julia Bakulina
>            Priority: Major
>              Labels: documentation, ise
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> It would be nice to have documentation for the Ignite index reader utility that was added in IGNITE-14529.
> {panel:title=Draft}
> // Here should also be an overview with the description of the purposes of the utility
> To run this utility, use the index-reader.sh or index-reader.bat script from the Ignite *bin* directory.
> *Command line parameters:*
> *--dir*: partition directory where index.bin and (optionally) partition files are located.
> *--part-cnt*: total number of partitions in the cache group. Default value: 0
> *--page-size*: page size. Default value: 4096
> *--page-store-ver*: page store version. Default value: 2
> *--indexes*: comma-separated list (without spaces) of index tree names to process; all other index trees are skipped. Default value: null. Index tree names are not the same as index names: they have the format _cacheId_typeId_indexName##H2Tree%segmentNumber_, e.g. {{2652_885397586_T0_F0_F1_IDX##H2Tree%0}}. You can find them in the utility output, in the traversal information sections (RECURSIVE and HORIZONTAL).
> *--dest-file*: file to print the report to (by default, the report is printed to the console). Default value: null
> *--check-parts*: check the cache data tree in partition files and its consistency with indexes. Default value: false
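> The tree name format described for *--indexes* can be split into its parts with plain shell parameter expansion. The following is only a sketch: the function name {{parse_tree_name}} is hypothetical, and it assumes that cacheId and typeId never contain an underscore and that the name ends with {{##H2Tree%segmentNumber}}.

```shell
# Split an index tree name of the form
#   cacheId_typeId_indexName##H2Tree%segmentNumber
# into its components. parse_tree_name is a hypothetical helper, not part of Ignite.
parse_tree_name() {
    name=$1
    cache_id=${name%%_*}             # text before the first underscore
    rest=${name#*_}                  # drop "cacheId_"
    type_id=${rest%%_*}              # text before the next underscore (may be negative)
    rest=${rest#*_}                  # drop "typeId_"
    index_name=${rest%'##H2Tree%'*}  # strip the "##H2Tree%segmentNumber" suffix
    segment=${name##*'%'}            # text after the last percent sign
    printf 'cacheId=%s typeId=%s indexName=%s segment=%s\n' \
        "$cache_id" "$type_id" "$index_name" "$segment"
}
```

> For example, {{parse_tree_name '2652_885397586_T0_F0_F1_IDX##H2Tree%0'}} prints {{cacheId=2652 typeId=885397586 indexName=T0_F0_F1_IDX segment=0}}.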
> The utility can analyze index.bin and, optionally, the partition files: if *--part-cnt* is greater than 0 and partition files are present, it reads CacheDataTree and looks into data pages to check their availability. It reads all index trees from index.bin and traverses them in two ways:
> - recursive traversal from root to leaves
> - level-by-level (horizontal) traversal, since all pages on one level are connected through forward ids.
> It also reads page reuse lists. Finally, it scans all pages in the file, trying to detect orphan pages (those that are not referenced from any index tree or reuse list).
> So, the output of the IgniteIndexReader consists of 4 main sections:
> - recursive traversal info (with prefix <RECURSIVE>)
> - horizontal traversal info (with prefix <HORIZONTAL>)
> - page reuse lists info (with prefix <PAGE_LIST>)
> - sequential scan of all pages.
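> Because the first three sections are marked by a line prefix, a saved report can be split or summarized with standard tools. The sketch below counts the lines of each prefixed section; {{count_section_lines}} is a hypothetical helper, and the report path is assumed to be a file written with *--dest-file*.

```shell
# Count report lines per section prefix. The report path is passed as $1.
# count_section_lines is a hypothetical helper, not part of the utility.
count_section_lines() {
    report=$1
    for prefix in '<RECURSIVE>' '<HORIZONTAL>' '<PAGE_LIST>'; do
        # grep -c prints the number of lines starting with the prefix.
        # It exits with status 1 when the count is 0, hence "|| true".
        count=$(grep -c -- "^$prefix" "$report" || true)
        printf '%s %s\n' "$prefix" "$count"
    done
}
```

> Calling {{count_section_lines report.txt}} prints one line per prefix followed by its line count. The sequential scan section has no prefix, so it is not counted here.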
> Optionally, with the *--check-parts* parameter, the output also includes information about how CacheDataTree matches SQL indexes. If there are no errors, only a message like this is printed:
> {noformat}
> Partitions check detected no errors.
> Partition check finished, total errors: 0, total problem partitions: 0
> {noformat}
> Otherwise, there is a “Partitions check:” section with a list of errors. For example, this is the message for an entry that was found in CacheDataTree but not in SQL indexes:
> {noformat}
> <ERROR> Errors detected in partition, partId=1023
> <ERROR> Entry is missing in index: I [idxName=2652_885397586_T0_F0_F1_IDX##H2Tree%0, pageId=0002ffff0000000d], cacheId=2652, partId=1023, pageIndex=8, itemId=0, link=285868728254472
> <ERROR> Entry is missing in index: I [idxName=2652_885397586_T0_F2_IDX##H2Tree%0, pageId=0002ffff0000000b], cacheId=2652, partId=1023, pageIndex=8, itemId=0, link=285868728254472
> {noformat}
> All errors in the output have the prefix <ERROR>.
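> Since every error line carries the <ERROR> prefix, missing-entry errors can be grouped per index tree with grep, sed, sort and uniq. This is a sketch that assumes the line layout shown in the sample above; {{missing_entries_per_index}} is a hypothetical helper name.

```shell
# Print how many "Entry is missing in index" errors each index tree has.
# The report path is passed as $1. Hypothetical helper, not part of the utility.
missing_entries_per_index() {
    grep -F '<ERROR> Entry is missing in index:' "$1" |
        # Keep only the tree name: the text between "idxName=" and the next comma.
        sed -e 's/.*idxName=\([^,]*\),.*/\1/' |
        LC_ALL=C sort | uniq -c |
        # uniq -c prints "count name"; reorder to "name: count".
        awk '{ print $2 ": " $1 }'
}
```

> For the two sample error lines above, this prints each of the two tree names followed by a count of 1.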
> h3. Command line examples
> Analyze files from /gridgain/corrupted_idxs for a cache group with 1024 partitions (some partition files may be missing if the node they were taken from did not own those partitions), using page size 4096 and page store version 2, and write the report to report.txt:
> {noformat}
> ./index-reader.sh --dir "/gridgain/corrupted_idxs" --part-cnt 1024 --page-size 4096 --page-store-ver 2 --dest-file "report.txt"
> {noformat}
> Read only SQL indexes:
> {noformat}
> ./index-reader.sh --dir "/gridgain/corrupted_idxs" --dest-file "report.txt"
> {noformat}
> Read SQL indexes and check cache data tree in partitions:
> {noformat}
> ./index-reader.sh --dir "/gridgain/corrupted_idxs" --part-cnt 1024 --check-parts --dest-file "rep
> {noformat}
> h3. Output samples
> The <RECURSIVE> and <HORIZONTAL> output sections contain information about index trees: tree name, root page id, page type statistics, and item count. The format is the same for both traversals.
> {noformat}
> <RECURSIVE> Index tree: I [idxName=2654_-1177891018__key_PK##H2Tree%0, pageId=0202ffff00000066]
> <RECURSIVE> -- Page stat:
> <RECURSIVE> H2ExtrasLeafIO: 2
> <RECURSIVE> H2ExtrasInnerIO: 1
> <RECURSIVE> BPlusMetaIO: 1
> <RECURSIVE> -- Count of items found in leaf pages: 200
> <RECURSIVE> No errors occurred while traversing.
> ...
> <RECURSIVE> Total trees: 19
> <RECURSIVE> Total pages found in trees: 49
> <RECURSIVE> Total errors during trees traversal: 2
> {noformat}
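> The per-tree item counts can be pulled out of a saved report with a short awk script. This sketch assumes the <RECURSIVE> line layout shown in the sample above; {{recursive_item_counts}} is a hypothetical helper name.

```shell
# For each tree in the <RECURSIVE> section, print "idxName itemCount".
# The report path is passed as $1. Hypothetical helper, not part of the utility.
recursive_item_counts() {
    awk '
        /^<RECURSIVE> Index tree:/ {
            # Remember the tree name: the text between "idxName=" and the next comma.
            name = $0
            sub(/.*idxName=/, "", name)
            sub(/,.*/, "", name)
        }
        /^<RECURSIVE> -- Count of items found in leaf pages:/ {
            # The count is the last whitespace-separated field on the line.
            print name, $NF
        }
    ' "$1"
}
```

> For the sample above, this prints {{2654_-1177891018__key_PK##H2Tree%0 200}}. Running the same extraction with the <HORIZONTAL> prefix would allow the two traversals to be compared side by side.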
> The page lists section contains reuse list bucket data (list meta, bucket number, and start pages of the lists found in the bucket) as well as page type statistics:
> {noformat}
> <PAGE_LIST> Page lists info.
> <PAGE_LIST> ---Printing buckets data:
> <PAGE_LIST> List meta id=844420635164675, bucket number=0, lists=[844420635164687]
> <PAGE_LIST> -- Page stat:
> <PAGE_LIST> H2ExtrasLeafIO: 32
> <PAGE_LIST> H2ExtrasInnerIO: 1
> <PAGE_LIST> BPlusMetaIO: 1
> <PAGE_LIST> ---No errors.
> {noformat}
> The sequential scan section also contains page type statistics:
> {noformat}
> ---These pages types were encountered during sequential scan:
> H2ExtrasLeafIO: 165
> H2ExtrasInnerIO: 19
> PagesListNodeIO: 1
> PagesListMetaIO: 1
> MetaStoreLeafIO: 5
> BPlusMetaIO: 20
> PageMetaIO: 1
> MetaStoreInnerIO: 1
> TrackingPageIO: 1
> ---
> Total pages encountered during sequential scan: 214
> Total errors occurred during sequential scan: 0
> {noformat}
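> In the sample above, the per-type counts add up to the reported total (214). Assuming this holds for any report, a quick sanity check can be sketched in awk; {{check_scan_total}} is a hypothetical helper name, and the line layout is taken from the sample.

```shell
# Sum the per-type page counts of the sequential scan section and compare
# the sum with the reported total. Exits with status 1 on a mismatch.
# The report path is passed as $1. Hypothetical helper, not part of the utility.
check_scan_total() {
    awk '
        # The statistics block starts at this header line ...
        /^---These pages types were encountered during sequential scan:/ { in_stats = 1; next }
        # ... and ends at the next line starting with "---".
        /^---/ { in_stats = 0 }
        # Inside the block, every "TypeName: N" line contributes N to the sum.
        in_stats && /: [0-9]+$/ { sum += $NF }
        /^Total pages encountered during sequential scan:/ { total = $NF }
        END {
            printf "sum=%d total=%d\n", sum, total
            if (sum != total) exit 1
        }
    ' "$1"
}
```

> For the sample above, this prints {{sum=214 total=214}} and exits with status 0.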
> The index reader compares the results of both traversals and the sizes of indexes belonging to the same cache, so you only need to look for errors in the output. For example, an error message about an index size inconsistency looks like this:
> {noformat}
> <ERROR> Index size inconsistency: cacheId=2652, typeId=885397586
> <ERROR>      Index name: I [idxName=2652_885397586_T0_F0_F1_IDX##H2Tree%0, pageId=0002ffff0000000d], size=1700
> <ERROR>      Index name: I [idxName=2652_885397586__key_PK##H2Tree%0, pageId=0002ffff00000005], size=0
> <ERROR>      Index name: I [idxName=2652_885397586_T0_F1_IDX##H2Tree%0, pageId=0002ffff00000009], size=1700
> <ERROR>      Index name: I [idxName=2652_885397586_T0_F0_IDX##H2Tree%0, pageId=0002ffff00000007], size=1700
> <ERROR>      Index name: I [idxName=2652_885397586_T0_F2_IDX##H2Tree%0, pageId=0002ffff0000000b]
> {noformat}
> {panel}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)