Posted to issues@ignite.apache.org by "Maxim Muzafarov (Jira)" <ji...@apache.org> on 2020/05/22 11:33:00 UTC

[jira] [Created] (IGNITE-13063) Bottom-up index rebuild

Maxim Muzafarov created IGNITE-13063:
----------------------------------------

             Summary: Bottom-up index rebuild
                 Key: IGNITE-13063
                 URL: https://issues.apache.org/jira/browse/IGNITE-13063
             Project: Ignite
          Issue Type: Improvement
            Reporter: Maxim Muzafarov
            Assignee: Maxim Muzafarov


As part of [IEP-22: Direct Data Load|https://cwiki.apache.org/confluence/display/IGNITE/IEP-22%3A+Direct+Data+Load], a PoC of the new index rebuild algorithm needs to be implemented.
Compare the bottom-up index rebuild approach with the default implementation (which builds the index from the root).

See IEP-22 for details.
h4. High-level overview

We will not update PK and secondary indexes during the data load, so they must be rebuilt at the end. The most efficient way to build an index is the bottom-up approach, in which the lowest level of the BTree is built first and the root is built last. This requires a buffer in which the indexed values and their respective links are sorted in index order. If the buffer is large enough to hold all the data, the index is created in a single pass. Otherwise, the indexed values must be sorted in several runs using an external sort. Users must be able to configure the sort parameters: the buffer size (ideally in bytes) and the file system path where temporary files are stored. The latter is critical: users typically want to keep temporary files on a separate disk so that WAL and checkpoint operations are not affected.
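
The bottom-up idea can be illustrated with a minimal in-memory sketch. It is not Ignite code: the class BottomUpBuildSketch, its Node type, and the fanout parameter are hypothetical names chosen for illustration, keys and links are simplified to long values, and the input is assumed to be already sorted in index order (by the in-memory buffer or by an external sort). Leaves are packed first, then each upper level is built from the level below until a single root remains.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; all names here are hypothetical, not Ignite APIs. */
class BottomUpBuildSketch {
    /** Tree node: a leaf holds (key, link) entries, an inner node holds children. */
    static final class Node {
        final long minKey;          // smallest key stored in this subtree
        final long[] keys;          // leaf only: indexed values in sorted order
        final long[] links;         // leaf only: links to the data rows
        final List<Node> children;  // inner node only

        Node(long[] keys, long[] links) {
            this.keys = keys;
            this.links = links;
            this.children = null;
            this.minKey = keys[0];
        }

        Node(List<Node> children) {
            this.keys = null;
            this.links = null;
            this.children = children;
            this.minKey = children.get(0).minKey;
        }

        boolean isLeaf() {
            return children == null;
        }
    }

    /** Builds the tree from entries that are already sorted by key. */
    static Node build(long[] sortedKeys, long[] links, int fanout) {
        if (sortedKeys.length == 0 || sortedKeys.length != links.length || fanout < 2)
            throw new IllegalArgumentException("need non-empty input and fanout >= 2");

        // Step 1: pack the sorted entries into the lowest (leaf) level.
        List<Node> level = new ArrayList<>();
        for (int i = 0; i < sortedKeys.length; i += fanout) {
            int end = Math.min(i + fanout, sortedKeys.length);
            level.add(new Node(Arrays.copyOfRange(sortedKeys, i, end),
                Arrays.copyOfRange(links, i, end)));
        }

        // Step 2: build each upper level from the one below; the root is built last.
        while (level.size() > 1) {
            List<Node> parents = new ArrayList<>();
            for (int i = 0; i < level.size(); i += fanout) {
                int end = Math.min(i + fanout, level.size());
                parents.add(new Node(new ArrayList<>(level.subList(i, end))));
            }
            level = parents;
        }
        return level.get(0);
    }

    /** Number of levels from the root down to the leaves. */
    static int height(Node node) {
        int h = 1;
        while (!node.isLeaf()) {
            node = node.children.get(0);
            h++;
        }
        return h;
    }

    /** Returns the link stored for the key, or -1 if the key is absent. */
    static long find(Node node, long key) {
        while (!node.isLeaf()) {
            Node next = node.children.get(0);
            for (Node child : node.children) {
                if (child.minKey <= key)
                    next = child;
                else
                    break;
            }
            node = next;
        }
        int idx = Arrays.binarySearch(node.keys, key);
        return idx >= 0 ? node.links[idx] : -1;
    }
}
```

Because the input is sorted, every node is written exactly once and no page splits occur, which is the main difference from inserting entries one by one from the root. A real implementation would write pages to the page store and read the sorted entries from merged external-sort runs rather than from arrays.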



--
This message was sent by Atlassian Jira
(v8.3.4#803005)