Posted to issues@hive.apache.org by "Sergey Shelukhin (JIRA)" <ji...@apache.org> on 2015/06/30 23:35:05 UTC

[jira] [Comment Edited] (HIVE-10940) HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader call

    [ https://issues.apache.org/jira/browse/HIVE-10940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609116#comment-14609116 ] 

Sergey Shelukhin edited comment on HIVE-10940 at 6/30/15 9:34 PM:
------------------------------------------------------------------

{noformat}
// lets take a look at the operator memory requirements.
{noformat}
This comment looks like it was copy-pasted.

Can you add a comment at the place where the new optimizer is added, indicating that it should stay at the end (for people who will be adding more optimizers)?

serializedFilterObject is never set anymore. Set or remove?


was (Author: sershe):
{noformat}
// lets take a look at the operator memory requirements.
{noformat}
this comment seems like it was c/p-ed.

Can you add comment to where the new optimizer is added indicating that it should run last?

serializedFilterObject is never set anymore. Set or remove?

> HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader call
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-10940
>                 URL: https://issues.apache.org/jira/browse/HIVE-10940
>             Project: Hive
>          Issue Type: Bug
>          Components: File Formats
>    Affects Versions: 1.2.0
>            Reporter: Gopal V
>            Assignee: Sergey Shelukhin
>             Fix For: 2.0.0
>
>         Attachments: HIVE-10940.01.patch, HIVE-10940.02.patch, HIVE-10940.patch
>
>
> {code}
>     String filterText = filterExpr.getExprString();
>     String filterExprSerialized = Utilities.serializeExpression(filterExpr);
> {code}
> serializeExpression initializes Kryo and produces a new packed object for every split.
> The call path is HiveInputFormat::getRecordReader -> pushProjectionAndFilters -> pushFilters.
> Kryo is very slow at this for a large filter clause.
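
One way to avoid the repeated work described above is to serialize the filter once and reuse the result on later calls. The following is only a minimal sketch of that idea, not the approach taken in the attached patches: SerializedFilterCache and its methods are hypothetical names, and the serializer is passed in as a plain function so the example stays self-contained instead of depending on Hive or Kryo.

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical helper (not part of Hive): memoizes the output of an
 * expensive serializer so that each expression object is serialized once.
 */
final class SerializedFilterCache<E> {
    // The expensive step, e.g. a Kryo-based serializer in a real setup (assumption).
    private final Function<E, String> serializer;
    // Keyed by object identity, so equals/hashCode of E are never invoked.
    private final Map<E, String> cache = new IdentityHashMap<>();
    // Counts how often the serializer actually ran; for illustration only.
    private int serializeCalls = 0;

    SerializedFilterCache(Function<E, String> serializer) {
        this.serializer = serializer;
    }

    /** Returns the cached serialized form, computing it on first use. */
    synchronized String get(E expr) {
        String cached = cache.get(expr);
        if (cached == null) {
            cached = serializer.apply(expr);
            serializeCalls++;
            cache.put(expr, cached);
        }
        return cached;
    }

    synchronized int serializeCalls() {
        return serializeCalls;
    }
}
```

In Hive, the function passed in would presumably wrap Utilities.serializeExpression, and the cache would need to live as long as the plan the filter belongs to. Keying by identity assumes the same filter expression object is reused across getRecordReader calls, which would have to be checked against the actual code.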


