Posted to dev@ignite.apache.org by Zhenya Stanilovsky <ar...@mail.ru.INVALID> on 2020/08/20 09:40:30 UTC

Re[2]: Update of the default inline size for variable types

Huge +1 to Ilya.
I checked your PR; this looks like a stub: Pattern.compile("\\w+\\((\\d+)\\)");
*  Do we normalize the type string before matching? As written, varchar(<whitespace>N) does not seem to match.
*  Can we obtain this information without a regexp?
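On the whitespace point, here is a minimal sketch of a more tolerant pattern. The class and method names (DeclaredLength, declaredLength) are made up for illustration and are not taken from the PR; it only assumes the type string looks like NAME(N), possibly with spaces around the tokens.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not code from the PR.
final class DeclaredLength {
    // Same shape as \w+\((\d+)\), but with optional whitespace around every token.
    // \d{1,9} keeps the captured number within int range.
    private static final Pattern LEN =
        Pattern.compile("\\s*\\w+\\s*\\(\\s*(\\d{1,9})\\s*\\)\\s*");

    // Returns the declared length, or empty if the type has no "(N)" part.
    static OptionalInt declaredLength(String sqlType) {
        Matcher m = LEN.matcher(sqlType);
        return m.matches()
            ? OptionalInt.of(Integer.parseInt(m.group(1)))
            : OptionalInt.empty();
    }
}
```

An empty result would let the caller fall back to the default size for a variable-length column instead of failing on an unexpected type string.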
>Hello!
>
>I can see what you are getting at but, as far as my experience tells me,
>64 is already too large for the average use case. It will also start to
>drag on performance, since fewer entries fit in one page and the tree
>grows taller, not to mention the extra I/O.
>
>I think we should benchmark it and see at which value we get a sharp decline.
>Maybe 64 is OK after all, if it's a maximum for a complex index. Just make
>sure that a single VARCHAR without length is still 10 and not 64.
>
>Regards,
>--
>Ilya Kasnacheev
>
>
>Thu, Aug 20, 2020 at 11:15, Evgeniy Rudenko < e.a.rudenko@gmail.com >:
> 
>> Hi guys,
>>
>> Thank you for your feedback.
>>
>> The current calculation of the default size is not completely correct. If it
>> meets a variable-length field (such as a byte array or string), it just
>> stops trying to make the index size more reasonable and uses
>> IGNITE_MAX_INDEX_PAYLOAD_SIZE_DEFAULT as its size. Such an approach doesn't
>> seem correct to me in any case. The first part of the update changes this
>> logic and calculates the size based on all indexed columns. This update can
>> even save some space for users with varchars and a high
>> IGNITE_MAX_INDEX_PAYLOAD_SIZE_DEFAULT value.
>>
>> The second part of the update increases IGNITE_MAX_INDEX_PAYLOAD_SIZE_DEFAULT.
>> Please note that we are changing only the upper bound of the default size.
>> Obviously this can increase the space used, but we are
>> trading size for speed here. The current default value is too small for the
>> average use case. Users who care about data size can still set the
>> exact size of each index or limit all sizes with
>> IGNITE_MAX_INDEX_PAYLOAD_SIZE_DEFAULT. So after the update, users who
>> want to keep the previous data size will just need to set
>> IGNITE_MAX_INDEX_PAYLOAD_SIZE_DEFAULT=10.
>>
>>
>>
>> On Wed, Aug 19, 2020 at 5:20 PM Vladislav Pyatkov < vldpyatkov@gmail.com >
>> wrote:
>>
>> > Hi,
>> >
>> > In my opinion, an inline size of 64 can significantly increase storage
>> > size.
>> > That can be difficult for users to understand.
>> >
>> > Earlier, as I remember, we planned to replace the inline value with a hash
>> > code when the value is larger than the inline size.
>> > That would help with "==" and "!=" comparisons without growing the
>> > storage size.
>> >
>> > I think the hash code optimization looks preferable, and as a last resort
>> > anyone can increase the inline size through the API.
>> >
>> >
>> > On Wed, Aug 19, 2020 at 9:22 AM Zhenya Stanilovsky
>> > < arzamas123@mail.ru.invalid > wrote:
>> >
>> > >
>> > >
>> > > >Hi guys,
>> > >
>> > > Evgeniy, hola!
>> > > >
>> > > >Currently, if a variable-length type (such as String or byte[]) is
>> > > >encountered in a composite index, the inline size just defaults to 10,
>> > > >which is almost always not enough. I am going to change this and
>> > > >implement the following changes:
>> > > >
>> > > >1) For a variable-length column, keep using 10 as the default size
>> > > >in the case of a one-column index. But if the index is composite, the
>> > > >default index size will be calculated as the sum of the sizes of all
>> > > >indexed columns. For example, for an index like (INT, VARCHAR, VARCHAR,
>> > > >INT) the default inline size will be 5 + 10 + 10 + 5 = 30 (5 for each
>> > > >int, 10 for each string).
>> > >
>> > > Why exactly this approach? Why not 5 + 10 and stop there? Do you have
>> > > some logical basis, a statistical distribution or something similar? For
>> > > now this looks like your own decision and nothing more. Am I wrong?
>> > > >
>> > > >2) For SQL varchar and binary columns with a defined length (for example
>> > > >VARCHAR(XX)), use XX + 3 as the default inline size for the column (3
>> > > >extra bytes are needed for the inner representation of the type).
>> > >
>> > > The same question here: why do you want to cover the full varchar
>> > > length? Did you compare with other vendors' approaches?
>> > > >
>> > > >3) The maximum default index size will still be limited by
>> > > >IGNITE_MAX_INDEX_PAYLOAD_SIZE, but its default value will be increased
>> > > >to 64. For example, for the index (VARCHAR, VARCHAR, VARCHAR, VARCHAR,
>> > > >VARCHAR, VARCHAR, VARCHAR) the default index size will be only 64. The
>> > > >same goes for columns with a defined length: by default a VARCHAR(100)
>> > > >column will create an index with a size of only 64.
>> > > >
>> > > >Please tell me if you have any concerns. The update can be found at
>> > > > https://github.com/apache/ignite/pull/8161
>> > > >
>> > > >Best regards,
>> > > >Evgeniy
>> > > >
>> > >
>> > >
>> > >
>> > >
>> >
>> >
>> >
>> > --
>> > Vladislav Pyatkov
>> >
>>
>>
>> --
>> Best regards,
>> Evgeniy
>>
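
For reference, the sizing rules proposed in the quoted thread (5 per INT, 10 per variable-length column without a declared length, XX + 3 for VARCHAR(XX), all capped by the maximum default) can be sketched as below. This is only an illustration under those stated numbers: the class, constants and method names are hypothetical and are not Ignite code or the code from the PR.

```java
// Hypothetical sketch of the rules discussed in this thread, not Ignite code.
final class InlineSizeSketch {
    static final int INT_SIZE = 5;           // "5 for each int"
    static final int VAR_DEFAULT_SIZE = 10;  // variable-length column without a declared length
    static final int VAR_OVERHEAD = 3;       // extra bytes for VARCHAR(XX)

    // Size of a VARCHAR(XX) or BINARY(XX) column: XX + 3.
    static int varSize(int declaredLen) {
        return declaredLen + VAR_OVERHEAD;
    }

    // Sum of all column sizes, capped by the maximum default payload size.
    static int defaultInlineSize(int maxDefault, int... columnSizes) {
        long sum = 0;
        for (int size : columnSizes)
            sum += size;
        return (int) Math.min(sum, (long) maxDefault);
    }
}
```

With a cap of 64, (INT, VARCHAR, VARCHAR, INT) gives 30, seven VARCHAR columns give 64, and a single VARCHAR(100) gives 64. With a cap of 10, the same composite index stays at 10, which matches the suggestion to set the maximum to 10 to keep the previous data size.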