Posted to issues@impala.apache.org by "Csaba Ringhofer (Jira)" <ji...@apache.org> on 2023/09/07 12:46:00 UTC

[jira] [Created] (IMPALA-12430) Optimize sending rows within the same process

Csaba Ringhofer created IMPALA-12430:
----------------------------------------

             Summary: Optimize sending rows within the same process
                 Key: IMPALA-12430
                 URL: https://issues.apache.org/jira/browse/IMPALA-12430
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
            Reporter: Csaba Ringhofer


Currently, sending row batches to exchange nodes always goes through KRPC, even if the sender and receiver are within the same process.

This means that the following work is done even though none of it is actually necessary:

sender:
1. serialize RowBatch to a single buffer
2. compress the buffer with LZ4
3. send the buffer as a sidecar in KRPC
receiver:
4. fetch buffer from KRPC
5. decompress the buffer
6. convert the buffer to RowBatch

Ideally, a single deep copy from the sender's RowBatch to the destination's RowBatch would be enough (the copy is needed so that the memory referenced by the original RowBatch can be cleaned up during the send).
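
To make the contrast between the two paths concrete, here is a minimal Python sketch. It is an illustration only, not Impala code: the RowBatch class is a hypothetical stand-in (a list of row tuples), pickle stands in for Impala's row batch serialization, zlib stands in for LZ4, and all function names are invented for this example.

```python
import copy
import pickle
import zlib
from dataclasses import dataclass, field


@dataclass
class RowBatch:
    """Hypothetical stand-in for a row batch: a list of row tuples."""
    rows: list = field(default_factory=list)


def send_via_rpc(batch: RowBatch) -> bytes:
    """Sender steps 1-3: serialize, compress, hand the buffer to the RPC layer."""
    buffer = pickle.dumps(batch.rows)   # 1. serialize to a single buffer
    payload = zlib.compress(buffer)     # 2. compress (zlib stands in for LZ4)
    return payload                      # 3. would be sent as a sidecar


def receive_via_rpc(payload: bytes) -> RowBatch:
    """Receiver steps 4-6: take the buffer, decompress it, rebuild the batch."""
    buffer = zlib.decompress(payload)            # 5. decompress the buffer
    return RowBatch(rows=pickle.loads(buffer))   # 6. convert back to a RowBatch


def send_local(batch: RowBatch) -> RowBatch:
    """Same-process shortcut: one deep copy, so the sender can free its batch."""
    return copy.deepcopy(batch)


def transfer(batch: RowBatch, same_process: bool) -> RowBatch:
    """Pick a path; both deliver an equal but independent batch."""
    if same_process:
        return send_local(batch)
    return receive_via_rpc(send_via_rpc(batch))
```

In this sketch both paths hand the receiver a batch that does not share memory with the sender's batch, which is the property the deep copy has to preserve.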

The most expensive part is step 2, the compression with LZ4 (decompression is much faster), and it can be avoided with minimal changes.
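
One possible reading of "minimal changes" is to keep the existing serialized path but skip compression when the destination is local. The Python sketch below illustrates that idea under stated assumptions: the Payload type, its compressed flag, and the dest_is_local parameter are all hypothetical, pickle stands in for row batch serialization, and zlib stands in for LZ4. It does not reflect how Impala actually encodes this.

```python
import pickle
import zlib
from typing import NamedTuple


class Payload(NamedTuple):
    """Hypothetical wire format: a flag plus the (possibly compressed) buffer."""
    compressed: bool
    data: bytes


def serialize(rows: list, dest_is_local: bool) -> Payload:
    """Serialize rows, compressing only when the receiver is in another process."""
    buffer = pickle.dumps(rows)
    if dest_is_local:
        # Same process: skip the costly compression step, keep the rest as is.
        return Payload(compressed=False, data=buffer)
    return Payload(compressed=True, data=zlib.compress(buffer))


def deserialize(payload: Payload) -> list:
    """Undo serialize(), decompressing only if the sender compressed the data."""
    buffer = zlib.decompress(payload.data) if payload.compressed else payload.data
    return pickle.loads(buffer)
```

The receiver only needs to check the flag, so the rest of the send and receive logic stays unchanged in this sketch.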




--
This message was sent by Atlassian Jira
(v8.20.10#820010)