Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2022/02/05 01:31:00 UTC
[jira] [Commented] (IMPALA-11104) Revisit computeMinScalarColumnMemReservation for ORC async IO
[ https://issues.apache.org/jira/browse/IMPALA-11104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17487392#comment-17487392 ]
Quanlong Huang commented on IMPALA-11104:
-----------------------------------------
Thanks for filing this!
FWIW, one difference is that in Parquet all column types can be dictionary encoded, whereas in ORC only string types can be dictionary encoded.
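To illustrate why that difference could matter for the estimate, here is a rough sketch of a format-aware reservation. It is not the actual HdfsScanNode code. The class, the method name, the parameters, and the rule "only reduce the reservation when the column may be dictionary encoded" are all assumptions made for illustration. Only the 4MB default comes from the issue description.

```java
public class ColumnReservationSketch {
    // Default per-column reservation named in the issue description (4MB).
    static final long DEFAULT_COLUMN_SCAN_RANGE_RESERVATION = 4L * 1024 * 1024;

    // Hypothetical enum; only the two formats discussed here.
    enum FileFormat { PARQUET, ORC }

    /**
     * Hypothetical helper, not Impala's implementation.
     * Returns a reservation in bytes that never exceeds the default.
     *
     * @param format        file format of the table
     * @param isStringType  whether the column is a string type
     * @param ndv           number of distinct values, or a negative value if unknown
     * @param avgValueBytes average serialized size of one value, in bytes
     */
    static long minScalarColumnReservation(
            FileFormat format, boolean isStringType, long ndv, double avgValueBytes) {
        // Assumption: in Parquet any column type may be dictionary encoded,
        // in ORC only string types may be.
        boolean mayBeDictEncoded =
                format == FileFormat.PARQUET
                        || (format == FileFormat.ORC && isStringType);

        // Without stats, or without dictionary encoding, keep the default.
        if (!mayBeDictEncoded || ndv < 0) {
            return DEFAULT_COLUMN_SCAN_RANGE_RESERVATION;
        }

        // Assumed estimate: dictionary size = distinct values * average value size.
        double estimate = Math.ceil(ndv * avgValueBytes);
        if (estimate >= DEFAULT_COLUMN_SCAN_RANGE_RESERVATION) {
            return DEFAULT_COLUMN_SCAN_RANGE_RESERVATION;
        }
        // Never return less than one byte.
        return Math.max(1L, (long) estimate);
    }
}
```

Under these assumptions, a non-string ORC column with known stats keeps the full 4MB default, while the same column in Parquet, or a string column in ORC, could get a smaller reservation.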
> Revisit computeMinScalarColumnMemReservation for ORC async IO
> -------------------------------------------------------------
>
> Key: IMPALA-11104
> URL: https://issues.apache.org/jira/browse/IMPALA-11104
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Reporter: Riza Suminto
> Priority: Major
>
> HdfsScanNode.computeMinScalarColumnMemReservation contains an estimate that can reduce the memory reservation for a column below DEFAULT_COLUMN_SCAN_RANGE_RESERVATION (4MB).
> [https://github.com/apache/impala/blob/df528fe/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java#L2226-L2235]
>
> This estimate is based on Parquet tables. We need to revisit it for ORC tables.