Posted to issues-all@impala.apache.org by "Tim Armstrong (Jira)" <ji...@apache.org> on 2020/04/15 16:15:00 UTC

[jira] [Created] (IMPALA-9660) Distributed codegen

Tim Armstrong created IMPALA-9660:
-------------------------------------

             Summary: Distributed codegen
                 Key: IMPALA-9660
                 URL: https://issues.apache.org/jira/browse/IMPALA-9660
             Project: IMPALA
          Issue Type: Improvement
          Components: Distributed Exec
            Reporter: Tim Armstrong


Another potential extension of IMPALA-5444 is to distribute the codegen work for different fragments across different backends. Today, each fragment generates the same code on every backend server it is assigned to run on. This is mostly redundant work (except for scan nodes, where different scan ranges may correspond to different file formats). It would be great to consolidate the code generation work items among the backend servers and avoid the redundant work. The codegen for a fragment (or for an exec node, if we allow ourselves to use multiple LLVM modules per fragment so that different exec nodes in a fragment can be codegen'd in parallel) could be assigned to specific backend servers, and the compiled code could be shipped to the other backend Impalad servers when it is ready. Of course, this may raise security issues, since we have to trust the binary being shipped over. We may also need to take into account the latency of shipping the code. However, this is potentially a huge saving in CPU time for queries with many fragments running on a large cluster.
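
To make the assignment idea concrete, here is a rough sketch of how codegen work could be spread across backends. It is not Impala code: the function name assign_codegen, the fragment and backend names, and the least-loaded heuristic are all assumptions made for illustration. It also ignores the scan node caveat (different file formats per scan range) as well as the security and shipping-latency concerns mentioned above.

```python
from collections import defaultdict

def assign_codegen(fragment_backends):
    """Pick one backend per fragment to compile it (hypothetical helper).

    fragment_backends maps a fragment id to the non-empty list of backends
    that execute that fragment. Returns (compiler, recipients):
      compiler[frag]   -> the backend chosen to run codegen for frag
      recipients[frag] -> the other backends, which would be shipped the
                          compiled code instead of generating it themselves
    """
    load = defaultdict(int)  # codegen tasks assigned to each backend so far
    compiler = {}
    recipients = {}
    # Place fragments with the fewest candidate backends first, so that
    # constrained fragments do not end up stacked on an already busy backend.
    order = sorted(fragment_backends,
                   key=lambda f: (len(fragment_backends[f]), f))
    for frag in order:
        backends = fragment_backends[frag]
        # Least-loaded candidate; ties are broken by name for determinism.
        chosen = min(backends, key=lambda b: (load[b], b))
        load[chosen] += 1
        compiler[frag] = chosen
        recipients[frag] = [b for b in backends if b != chosen]
    return compiler, recipients

# Made-up plan: F0 and F1 run on three backends, F2 only on b1.
plan = {
    "F0": ["b1", "b2", "b3"],
    "F1": ["b1", "b2", "b3"],
    "F2": ["b1"],
}
compiler, recipients = assign_codegen(plan)
```

In this made-up plan, codegen currently runs once per fragment instance (7 times in total), whereas the sketch runs it once per fragment (3 times) and ships the result to the remaining 4 instances. Whether that is a net win would depend on the shipping latency relative to the codegen time.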


