You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@asterixdb.apache.org by "Chen Luo (Jira)" <ji...@apache.org> on 2020/09/28 19:26:00 UTC

[jira] [Assigned] (ASTERIXDB-2784) Join memory requirement for large objects

     [ https://issues.apache.org/jira/browse/ASTERIXDB-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chen Luo reassigned ASTERIXDB-2784:
-----------------------------------

    Assignee: Shiva Jahangiri

> Join memory requirement for large objects
> -----------------------------------------
>
>                 Key: ASTERIXDB-2784
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2784
>             Project: Apache AsterixDB
>          Issue Type: Improvement
>          Components: COMP - Compiler, RT - Runtime
>            Reporter: Chen Luo
>            Assignee: Shiva Jahangiri
>            Priority: Major
>
> Currently the compiler assumes the minimum number of join frames is 5 [1]. However, this does not guarantee a join will always succeed in case of large objects. The actual join memory requirement is actually MAX(5, #partitions * #large object size). The reason is that in the spill policy [2], we only spill a partition if it hasn't been spilled before. As a result, when we are writing to an empty partition, it is possible that each of other partitions has one large object (which could be larger than the frame size) but no partition can be spilled. Thus, the join memory requirement becomes #partitions * #large object size in this case.
> [1] [https://github.com/apache/asterixdb/blob/master/hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/AbstractJoinPOperator.java#L29)|https://github.com/apache/asterixdb/blob/master/hyracks-fullstack/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/AbstractJoinPOperator.java#L29).]
> [2] https://github.com/apache/asterixdb/blob/37dfed60fb47afcc86de6d17704a8f100217057d/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/buffermanager/PreferToSpillFullyOccupiedFramePolicy.java#L55



--
This message was sent by Atlassian Jira
(v8.3.4#803005)