Posted to issues-all@impala.apache.org by "Tim Armstrong (JIRA)" <ji...@apache.org> on 2019/04/17 16:24:00 UTC
[jira] [Commented] (IMPALA-8431) Parquet STRING column memory reservation seems underestimated

    [ https://issues.apache.org/jira/browse/IMPALA-8431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16820265#comment-16820265 ] 

Tim Armstrong commented on IMPALA-8431:
---------------------------------------

FWIW this would mainly have a perf impact if memory was constrained during scanning - the scanner will always ask for more memory if it looks at the file layout and sees that the "ideal" memory to scan the file is greater than the minimum. The reservations are by design based only on the on-disk size of the data, since they're for sizing I/O buffers. So it's correct to ignore the StringValue representation, but indeed the estimate doesn't factor in on-disk overhead like string lengths. It also doesn't factor in the null bits, but I think it's reasonable to assume that those compress very well. There are inaccuracies in the other direction too - the heuristics don't account well for general-purpose compression of string values, which can dramatically shrink the data. It would be interesting to see how accurate the estimates are for "real" data, although I'm not sure how important it is right now.

The good thing is that the backend support is designed to operate correctly with any size of I/O buffers, and to always ask for more memory at runtime if it could benefit from it. The minimum reservation is meant to avoid poor performance because of the scanner being starved for I/O buffers, without over-reserving memory that the scanner won't be able to use. E.g. imagine that the scanner has to scan a 100MB column but only has 64KB of I/O buffers - that's a lot of I/Os.
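
To put a rough number on that example: the read count is just the column size divided by the buffer size, rounded up. This is a back-of-the-envelope sketch, not Impala code; `num_ios` is a hypothetical helper, and the 8MB figure is an arbitrary comparison point rather than a value taken from Impala.

```python
KB = 1024
MB = 1024 * KB


def num_ios(column_bytes: int, buffer_bytes: int) -> int:
    """Reads needed to scan a column when each read fills one buffer (ceiling division)."""
    if buffer_bytes <= 0:
        raise ValueError("buffer_bytes must be positive")
    return -(-column_bytes // buffer_bytes)


# The 100MB column from the example above, read through 64KB of buffer.
print(num_ios(100 * MB, 64 * KB))  # 1600
# The same column with a hypothetical 8MB buffer, for comparison.
print(num_ios(100 * MB, 8 * MB))   # 13
```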

Another assumption is that the I/O buffers are the biggest factor in memory consumption for scans and the in-memory RowBatches are smaller in comparison because we only materialise 1024 rows at a time. There are some additional heuristics in HdfsScanNode to try and account for this overhead when spinning up more scanner threads. It's not ideal but it works fairly well.
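
For a sense of scale, a batch of 1024 rows stays small unless rows are very wide. The sketch below is only illustrative arithmetic: `row_batch_bytes` is a hypothetical helper, and the per-row byte counts are made-up inputs, not figures from Impala's planner or backend.

```python
ROWS_PER_BATCH = 1024  # batch size mentioned above


def row_batch_bytes(tuple_bytes: int, var_len_bytes_per_row: int = 0,
                    rows: int = ROWS_PER_BATCH) -> int:
    """Approximate memory for one batch: fixed-size tuple plus variable-length data per row."""
    return rows * (tuple_bytes + var_len_bytes_per_row)


# Hypothetical row: 16 bytes of fixed-size slots plus about 20 bytes of string data.
print(row_batch_bytes(16, 20))  # 36864 bytes, i.e. 36KB
```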

> Parquet STRING column memory reservation seems underestimated
> -------------------------------------------------------------
>
>                 Key: IMPALA-8431
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8431
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 3.2.0
>            Reporter: Csaba Ringhofer
>            Priority: Minor
>              Labels: parquet, reservation
>
> https://github.com/apache/impala/blob/5fa076e95cfbfcc044dc14cbb20af825936af82a/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java#L1698
> computeMinScalarColumnMemReservation() uses stat avg_size to estimate the memory needed for a value during scanning, but this does not contain the 4 byte / value length field used in plain encoding, which can dominate columns with very short strings. (Compression can probably negate this effect.)
> In case of dict decoding estimation:
> - this 4 byte/NDV should also be added, as the dictionary itself is also plain encoded
> - the backend uses + 12 byte/NDV for the StringValues used as indirection in the dictionary, but I am not sure if this should be added to the reservation
> - a more pessimistic estimation would use max_size instead of avg_size for dictionary entries, as it is possible that the majority of distinct values are long, but the short ones are much more frequent, which makes the avg_size small
> Another small underestimation is that NULL values are ignored. NULLs (=def levels) could be added as 1 bit/value.
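
The effect of the terms listed in the description can be sketched with simple arithmetic. This is not the logic in computeMinScalarColumnMemReservation(); the function names are hypothetical, the inputs are made-up, and compression is ignored entirely. It only shows how the 4-byte length field and the 1 bit/value def levels change an uncompressed size estimate.

```python
import math

LEN_PREFIX_BYTES = 4  # per-value length field in plain encoding, as described in the issue


def plain_encoded_bytes(num_values: int, avg_size: float,
                        include_len_prefix: bool = True) -> int:
    """Uncompressed size of plain-encoded strings, with or without the length field."""
    per_value = avg_size + (LEN_PREFIX_BYTES if include_len_prefix else 0)
    return math.ceil(num_values * per_value)


def dictionary_bytes(ndv: int, entry_size: float) -> int:
    """Dictionary page size: one plain-encoded entry per distinct value.

    Pass max_size instead of avg_size as entry_size for the more pessimistic estimate.
    """
    return plain_encoded_bytes(ndv, entry_size)


def def_level_bytes(num_values: int) -> int:
    """Definition levels at 1 bit per value, rounded up to whole bytes."""
    return math.ceil(num_values / 8)


# Hypothetical column: one million values averaging 2 bytes each.
print(plain_encoded_bytes(1_000_000, 2.0, include_len_prefix=False))  # 2000000
print(plain_encoded_bytes(1_000_000, 2.0))                            # 6000000
print(def_level_bytes(1_000_000))                                     # 125000
```

With strings this short, the length field triples the uncompressed estimate, which is the "can dominate" case from the description.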


