Posted to dev@ignite.apache.org by Andrey Mashenkov <an...@gmail.com> on 2021/06/04 13:27:45 UTC

Re: IEP-54: Schema-first approach for 3.0

Hi Igniters,

Alex Scherbakov and I had a discussion on how we could write rows in a more
compact way.
Many thanks to Alex for his ideas and criticism.
So, in the long read below I want to share some thoughts.

Motivation.

In Ignite 3.0 we will have a versioned schema, and most of the meta info
will be stored in the schema.
This approach gives less overhead on row size compared to BinaryObject
in Ignite 2.0,
but I see we can still save up to 5-25% in some use cases that look
promising.

Apparently, there is always a trade-off between row footprint, field access
performance, and code complexity.
We won't fight to the death for every single byte, only for a relatively
low row footprint overhead,
but we can still have different formats/techniques for writing compact meta
(sizes, offsets ...) to cover common use cases.

Description.

The performance of Ignite table/index operations (and not only them)
correlates with the key size.
Because of this, we recommend keeping keys as small as possible, and
small keys like long, UUID, or short strings are widely used.

Value size may differ and depends on the use case. AFAIK some users need
MB+ sized values.
I don't know whether any corner cases occur in production, such as 100+
varlen columns (especially short ones), huge values, or keys > 64kb,
and whether they are relevant to Ignite goals and the target audience.

Below I use the term 'chunk' meaning a key or value byte sequence.

Points and techniques to save a few bytes:
* Chunk size.
If the key is a single long value, then using a single byte instead of a
short for the chunk size may halve the overhead (25% -> 12%).

* Vartable item size (varlen column offset or varlen column size).
Here we can save a noticeable number of bytes if the user has many short
varlen columns.
E.g. with 10 short strings of 10 bytes (100 in total) we can save 10%.

* Using varlen column sizes instead of offsets.
We can use items of 'byte' even if the total chunk size does not fit into a
byte.
E.g. if the user has 30 strings of 10 chars each (300 bytes in total), then
by using 'sizes' of byte (instead of short) we could save 10%.
This increases the complexity of column offset calculation (up to linear),
but I think we shouldn't worry about the performance impact here,
because CPU is cheap here: vartable items reside locally, and in most cases
(32-64) varlen column sizes can fit into one cache line,
and the calculations can be effectively vectorized.

* Use 'varInt' format for sizes.
In short: the VarInt format implies we use the sign bit as a flag showing
whether the data spans more bytes or not.
So, a non-negative byte means the byte itself is the value. A negative byte
means the value spans more bytes, and we have to drop the sign bit and
concatenate the remaining 7 bits with the next byte.
Thus, if a number fits a smaller type, then we can use fewer bytes to store
it.
Total chunk size calculation may be a bit tricky.

* Strings size precalculation.
The problem is that we need to analyze characters to estimate the string
size before starting key/value serialization.
However, the string length alone decides most cases, so we need to check
symbol-by-symbol only for strings of 64-255 chars, as char[63] will always
fit byte[255] and char[256] will never fit byte[255]
(with the varInt format, 32-127 bounds can be used).
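
Three of the points above (byte 'sizes' instead of offsets, the varInt
format, and the string size pre-check) can be sketched in Java as below.
This is only a minimal sketch under assumptions, not the Ignite 3 row
format: the class and method names are hypothetical, the varInt groups are
written most significant first (as "concat with the next byte" suggests),
and strings are assumed to be encoded as well-formed UTF-8.

```java
/** Hypothetical helper, for illustration only; not part of Ignite. */
final class CompactSizes {
    /** Offset of varlen column {@code idx} from the start of the varlen
     *  data, computed as the sum of the preceding unsigned byte sizes. */
    static int offsetOf(byte[] sizes, int idx) {
        int off = 0;
        for (int i = 0; i < idx; i++)
            off += sizes[i] & 0xFF; // treat the byte as unsigned
        return off;
    }

    /** Writes a non-negative int as 7-bit groups, most significant first.
     *  The sign bit is set on every byte except the last one. */
    static byte[] writeVarInt(int value) {
        if (value < 0)
            throw new IllegalArgumentException("negative size: " + value);
        int len = 1;
        while (len < 5 && (value >>> (7 * len)) != 0)
            len++;
        byte[] out = new byte[len];
        for (int i = 0; i < len; i++) {
            int group = (value >>> (7 * (len - 1 - i))) & 0x7F;
            out[i] = (byte) (i < len - 1 ? (group | 0x80) : group);
        }
        return out;
    }

    /** Reads a value written by writeVarInt starting at {@code pos}. */
    static int readVarInt(byte[] buf, int pos) {
        int result = 0;
        byte b;
        do {
            b = buf[pos++];
            result = (result << 7) | (b & 0x7F); // drop the sign bit
        } while (b < 0); // negative byte: more bytes follow
        return result;
    }

    /** UTF-8 length of a string, assuming well-formed surrogate pairs. */
    static int utf8Length(String s) {
        int len = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) len += 1;
            else if (c < 0x800) len += 2;
            else if (Character.isSurrogate(c)) len += 2; // pair = 4 bytes
            else len += 3;
        }
        return len;
    }

    /** True if the encoded string size fits into an unsigned byte. */
    static boolean fitsInByte(String s) {
        int n = s.length();
        if (n <= 63) return true;   // bound from the text: always fits
        if (n > 255) return false;  // every char takes at least one byte
        return utf8Length(s) <= 255; // 64..255 chars: scan the symbols
    }
}
```

With 30 sizes of 10 bytes each, offsetOf(sizes, 29) walks 29 items to find
the last offset, which is the linear cost mentioned above; a size of 300
takes two varInt bytes, while anything up to 127 takes one.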

There are other more restrictive ways:

* Varlen table (vartable) size of byte.
Does one need more than 255 varlen columns? E.g. Oracle has a limit of 1000
total columns.
Actually, the impact is low enough: we can save a byte per varlen column.
Moreover, we already have optimization to skip the first varlen offset (or
last varlen length).
So, we will not write a vartable for a single varlen column in a chunk.

* Restrict varlen sizes to 64kb and introduce a BLOB type for varlen sizes >
64kb.
This allows excluding cases with items of 'int' in the vartable and
therefore reduces the number of flags, chunk reader/writer implementations,
and overall code complexity.
I'd suggest discussing the BLOB type in a separate thread and implementing
it separately.
In short, we can store a BLOB 'uuid' in a row instead, and store the BLOB
content in separate storage. RowAssembler can write row bytes and pairs
('uuid','content') separately to different arrays. The transport protocol
should be aware of BLOBs.

* Alex's idea. We can have 2 varlen tables, for small (len of < 255 bytes)
and large varlens (len of < 64k), with byte and short offsets
correspondingly.
It is assumed varlen columns are sorted by their types (e.g. shorter first).
This can be effective if the user has a number of small varlens and one
larger one, which would otherwise force us to use longer vartable items.
The drawback is that the user must define a max-length constraint for
varlen columns at the schema declaration step to turn on the optimization
for columns of short types.
Because column order is defined in the schema, we can't re-sort the columns
of a row at runtime and apply the optimization to short values of long-type
columns.
E.g. if the user defines a VARCHAR(1024) column in a schema and passes a
short value of 10 chars, we can't use an item of the first vartable for that
string, as the second vartable must be used, with 10% overhead.

IMHO: users won't bother about String sizes (estimating the size of a
localized string is hard) and will not get any benefit in many cases.
The live-schema concept, which is designed for automatic schema management,
will de facto become semi-automatic, as every Ignite user desires max
performance.

So, questions are:
1. Do we want to support more than 255 varlen columns?
2. Are we ok with the BLOB idea and max varlen size of 64kb?
3. Are we ok with varInt for chunk (and/or vartable) size?
4. What about the idea with 2 vartables? Do we prefer optimization based
on the schema rather than on row content?


Any other use cases or ideas or thoughts?

On Wed, May 26, 2021 at 3:00 PM StephanieSy <im...@gmail.com> wrote:

> That's the same case for me!. I've just downgraded my typescript version
> and
> everything starts working. How did you notice that the typescript's version
> was the problem?
>
>
>
> --
> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
>


-- 
Best regards,
Andrey V. Mashenkov