Posted to issues@hive.apache.org by "Jason Dere (JIRA)" <ji...@apache.org> on 2018/04/12 21:38:00 UTC

[jira] [Commented] (HIVE-18252) Limit the size of the object inspector caches

    [ https://issues.apache.org/jira/browse/HIVE-18252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16436343#comment-16436343 ] 

Jason Dere commented on HIVE-18252:
-----------------------------------

Going to try the following approach:
- Remove caching for the complex object inspectors (list/map/union/struct).
- This requires implementing equals()/hashCode() for the complex object inspectors, as well as for the constant object inspectors.
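
A rough sketch of what value-based equality could look like, so that two separately constructed inspectors describing the same struct compare equal. The classes ConstantInspector and StructInspector below are hypothetical stand-ins invented for illustration; they are not Hive's actual object inspector classes, and the fields they compare are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for a constant object inspector (not a Hive class).
final class ConstantInspector {
    private final String typeName;
    private final Object value;

    ConstantInspector(String typeName, Object value) {
        this.typeName = typeName;
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ConstantInspector)) return false;
        ConstantInspector other = (ConstantInspector) o;
        // Compare by type name and constant value, not by instance identity.
        return typeName.equals(other.typeName) && Objects.equals(value, other.value);
    }

    @Override
    public int hashCode() {
        return Objects.hash(typeName, value);
    }
}

// Hypothetical stand-in for a struct object inspector (not a Hive class).
final class StructInspector {
    private final List<String> fieldNames;
    private final List<Object> fieldInspectors;

    StructInspector(List<String> fieldNames, List<?> fieldInspectors) {
        this.fieldNames = new ArrayList<>(fieldNames);
        this.fieldInspectors = new ArrayList<Object>(fieldInspectors);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof StructInspector)) return false;
        StructInspector other = (StructInspector) o;
        // List.equals compares element by element, so this recurses into
        // the equals() of each nested field inspector.
        return fieldNames.equals(other.fieldNames)
            && fieldInspectors.equals(other.fieldInspectors);
    }

    @Override
    public int hashCode() {
        return Objects.hash(fieldNames, fieldInspectors);
    }
}

class InspectorEqualityDemo {
    public static void main(String[] args) {
        StructInspector a = new StructInspector(
            Arrays.asList("c0"), Arrays.asList(new ConstantInspector("int", 1)));
        StructInspector b = new StructInspector(
            Arrays.asList("c0"), Arrays.asList(new ConstantInspector("int", 1)));
        System.out.println(a.equals(b));
        System.out.println(a.hashCode() == b.hashCode());
    }
}
```

With equality defined this way, callers can compare inspectors without relying on a cache to hand back the same instance. The cost is that each comparison walks the nested structure.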

> Limit the size of the object inspector caches
> ---------------------------------------------
>
>                 Key: HIVE-18252
>                 URL: https://issues.apache.org/jira/browse/HIVE-18252
>             Project: Hive
>          Issue Type: Bug
>          Components: Types
>            Reporter: Jason Dere
>            Assignee: Jason Dere
>            Priority: Major
>         Attachments: HIVE-18252.1.patch
>
>
> Was running some tests that had a lot of queries with constant values, and noticed that ObjectInspectorFactory.cachedStandardStructObjectInspector started using up a lot of memory.
> It appears that StructObjectInspector caching does not work properly with constant values. Constant ObjectInspectors are not cached, so each constant expression creates a new constant ObjectInspector. And since object inspectors do not override equals(), object inspector comparison relies on object instance comparison. So even if the values are exactly the same as what is already in the cache, the StructObjectInspector cache lookup would fail, and Hive would create a new object inspector and add it to the cache, creating another entry that would never be used. Plus, there is no max cache size - it's just a map that is allowed to grow as long as values keep getting added to it.
> Some possible solutions I can think of:
> 1. Limit the size of the object inspector caches, rather than growing without bound.
> 2. Try to fix the caching to work with constant values. This would require implementing equals() on the constant object inspectors (which could be slow in nested cases), or else we would have to start caching constant object inspectors, which could be expensive in terms of memory usage. Could be used in combination with (1). By itself this is not a great solution because this still has the unbounded cache growth issue.
> 3. Disable caching in the case of constant object inspectors since this scenario currently doesn't work. This could be used in combination with (1).
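
For reference, option (1) from the description above could be sketched roughly as follows. This is only an illustration of a size-bounded cache with least-recently-used eviction built on java.util.LinkedHashMap; the class name BoundedCache and the choice of LRU eviction are assumptions, not what Hive uses or what the attached patch does. The sketch is not thread-safe, so a shared cache would need external synchronization.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical size-bounded cache (not a Hive class).
final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most
        // recently accessed, which makes the eldest entry the LRU entry.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by put() after insertion; returning true evicts the eldest entry.
        return size() > maxEntries;
    }
}

class BoundedCacheDemo {
    public static void main(String[] args) {
        BoundedCache<String, Integer> cache = new BoundedCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently used entry
        cache.put("c", 3); // exceeds the limit, so "b" is evicted
        System.out.println(cache.keySet());
    }
}
```

A bound like this caps memory use, but by itself it does not make lookups succeed for keys that compare by instance identity; those entries would still be inserted and later evicted without ever being hit.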



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)