You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@impala.apache.org by "Csaba Ringhofer (Jira)" <ji...@apache.org> on 2023/09/07 12:46:00 UTC
[jira] [Created] (IMPALA-12430) Optimize sending rows within the same process
Csaba Ringhofer created IMPALA-12430:
----------------------------------------
Summary: Optimize sending rows within the same process
Key: IMPALA-12430
URL: https://issues.apache.org/jira/browse/IMPALA-12430
Project: IMPALA
Issue Type: Improvement
Components: Backend
Reporter: Csaba Ringhofer
Currently sending row batches to exchange nodes always goes through KRPC even if the sender and receiver are within the same process.
This means that the following work is done without actually being necessary:
sender:
1. serialize RowBatch to a single buffer
2. compress the buffer with LZ4
3. send the buffer as a sidecar in KRPC
receiver:
4. fetch buffer from KRPC
5. decompress the buffer
6. convert the buffer to RowBatch
Ideally a single deep copy from the sender's RowBatch to the destination's RowBatch is enough (this is needed to cleanup the memory referenced in the original RowBatch during send).
The most expensive part is 2, the compression with LZ4 (decompression is much faster) and can be avoided with minimal changes.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)