Posted to dev@hive.apache.org by "Zheng Shao (JIRA)" <ji...@apache.org> on 2009/02/02 04:25:59 UTC

[jira] Commented: (HIVE-34) Make DynamicSerDe capable of skipping fields that will not be used in the query

    [ https://issues.apache.org/jira/browse/HIVE-34?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12669500#action_12669500 ] 

Zheng Shao commented on HIVE-34:
--------------------------------

A recent performance study by Rodrigo showed that creating a new String object for each column in each row adds significant overhead.
We might want to use lazy initialization to avoid the cost of creating these String objects (or use a modified Text class).
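
A minimal sketch of what lazy initialization could look like, assuming the row is available as a byte array encoded in UTF-8. The class name LazyString and its methods are hypothetical and not part of Hive or DynamicSerDe; the point is only that the String is built on first access rather than for every column of every row.

```java
import java.nio.charset.StandardCharsets;

/**
 * Hypothetical holder for one column value. It keeps a reference to the
 * row's bytes and decodes them into a String only when get() is first called.
 */
final class LazyString {
    private final byte[] buf;   // shared row buffer, not copied
    private final int start;    // offset of this field within buf
    private final int length;   // length of this field in bytes
    private String decoded;     // stays null until the value is requested

    LazyString(byte[] buf, int start, int length) {
        this.buf = buf;
        this.start = start;
        this.length = length;
    }

    /** Decodes on first use and caches the result for later calls. */
    String get() {
        if (decoded == null) {
            decoded = new String(buf, start, length, StandardCharsets.UTF_8);
        }
        return decoded;
    }

    /** True once get() has been called at least once. */
    boolean isDecoded() {
        return decoded != null;
    }
}
```

Columns that the query never touches then cost only a small wrapper object, or nothing at all if the wrappers themselves are reused across rows.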

> Make DynamicSerDe capable of skipping fields that will not be used in the query
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-34
>                 URL: https://issues.apache.org/jira/browse/HIVE-34
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Pete Wyckoff
>
> Thrift/DynamicSerDe always deserializes every field in the record and converts it to the correct type. Often, only a few of the fields will be used.
> e.g., select foo.user from foo where foo.created < 'today'
> where foo is something like
> struct {
>   string user
>   i64 created
>   string fullname
>   string description
>   i32 something
>   i32 somethingelse
>   ...
> }
> Parsing fullname, description, something, and somethingelse is a waste in this case.
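
To illustrate the idea in the description, here is a rough sketch that assumes a delimiter-separated text row rather than the actual Thrift protocol. SelectiveRowParser and its parse method are hypothetical names, not DynamicSerDe's API. Only the columns flagged in the needed set are converted, and scanning stops after the last needed column.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Hypothetical parser that converts only the columns a query refers to. */
final class SelectiveRowParser {

    /**
     * Returns one slot per column. Columns not marked in `needed` stay null
     * and are never decoded; bytes after the last needed column are not scanned.
     */
    static String[] parse(byte[] row, byte delimiter, int numColumns, BitSet needed) {
        String[] out = new String[numColumns];
        // BitSet.length() is the index of the highest set bit plus one.
        int stopAt = Math.min(numColumns, needed.length());
        int col = 0;
        int fieldStart = 0;
        for (int i = 0; i <= row.length && col < stopAt; i++) {
            // A field ends at a delimiter or at the end of the row.
            if (i == row.length || row[i] == delimiter) {
                if (needed.get(col)) {
                    out[col] = new String(row, fieldStart, i - fieldStart,
                                          StandardCharsets.UTF_8);
                }
                col++;
                fieldStart = i + 1;
            }
        }
        return out;
    }
}
```

For the example query, only user and created (columns 0 and 1 in the struct above) would be marked as needed, so fullname, description, something, and somethingelse are never parsed.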

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.