Posted to dev@hive.apache.org by "Zheng Shao (JIRA)" <ji...@apache.org> on 2009/02/02 04:25:59 UTC
[jira] Commented: (HIVE-34) Make DynamicSerDe capable of skipping
fields that will not be used in the query
[ https://issues.apache.org/jira/browse/HIVE-34?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12669500#action_12669500 ]
Zheng Shao commented on HIVE-34:
--------------------------------
A recent performance study by Rodrigo showed that creating a new String object for each column in each row adds significant overhead.
We might want to use lazy initialization to avoid the cost of creating these String objects (or use a modified Text class).
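A minimal sketch of the lazy initialization idea, to make it concrete. The class name LazyStringField and its methods are hypothetical and not part of Hive or DynamicSerDe; the sketch assumes the row is available as a UTF-8 byte buffer and that each field is described by an offset and a length. The String is created only when a caller first asks for it.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical lazily decoded string field; not an existing Hive class. */
final class LazyStringField {
    private final byte[] buffer;   // shared row buffer, referenced but not copied
    private final int offset;      // start of this field within the buffer
    private final int length;      // length of this field in bytes
    private String decoded;        // stays null until the value is first requested
    private int decodeCount;       // kept only to illustrate when decoding happens

    LazyStringField(byte[] buffer, int offset, int length) {
        this.buffer = buffer;
        this.offset = offset;
        this.length = length;
    }

    /** Decodes the bytes on first use and caches the result. */
    String get() {
        if (decoded == null) {
            decoded = new String(buffer, offset, length, StandardCharsets.UTF_8);
            decodeCount++;
        }
        return decoded;
    }

    /** Number of times a String was actually created for this field. */
    int decodeCount() {
        return decodeCount;
    }
}
```

With this shape, a column that the query never reads costs two ints and a reference per row rather than a String allocation. The sketch assumes the underlying buffer is not reused for the next row while the field object is still live.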
> Make DynamicSerDe capable of skipping fields that will not be used in the query
> -------------------------------------------------------------------------------
>
> Key: HIVE-34
> URL: https://issues.apache.org/jira/browse/HIVE-34
> Project: Hadoop Hive
> Issue Type: Bug
> Components: Serializers/Deserializers
> Reporter: Pete Wyckoff
>
> Thrift/DynamicSerDe always deserializes every field in the record and converts it to the correct type. Often, only a few of the fields are used.
> e.g., select foo.user from foo where foo.created < 'today'
> where foo is something like
> struct {
> string user
> i64 created
> string fullname
> string description
> i32 something
> i32 somethingelse
> ...
> }
> Parsing fullname, description, something, and somethingelse is a waste in this case.
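The field skipping described in the issue could look roughly like the sketch below. It is an illustration under stated assumptions, not DynamicSerDe's implementation and not Thrift's wire format: the record layout (each field stored as a 4-byte length followed by UTF-8 bytes), the class name SkippingReader, and the encode helper are all hypothetical. Fields the query does not need are skipped by advancing the buffer position, without allocating or converting anything.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Hypothetical reader that decodes only the fields a query needs. */
final class SkippingReader {

    /**
     * Reads fieldCount length-prefixed fields from the record.
     * Fields whose index is set in needed are decoded; all others are
     * skipped and left as null in the result.
     */
    static String[] read(ByteBuffer record, int fieldCount, BitSet needed) {
        String[] out = new String[fieldCount];
        for (int i = 0; i < fieldCount; i++) {
            int len = record.getInt();
            if (needed.get(i)) {
                byte[] bytes = new byte[len];
                record.get(bytes);
                out[i] = new String(bytes, StandardCharsets.UTF_8);
            } else {
                // Skip the payload without copying or decoding it.
                record.position(record.position() + len);
            }
        }
        return out;
    }

    /** Helper for the example: writes fields in the assumed layout. */
    static ByteBuffer encode(String... fields) {
        int size = 0;
        byte[][] encoded = new byte[fields.length][];
        for (int i = 0; i < fields.length; i++) {
            encoded[i] = fields[i].getBytes(StandardCharsets.UTF_8);
            size += 4 + encoded[i].length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (byte[] e : encoded) {
            buf.putInt(e.length);
            buf.put(e);
        }
        buf.flip();
        return buf;
    }
}
```

For the example query, only user and created (indices 0 and 1 in the struct above) would be set in the BitSet, so fullname, description, and the remaining fields are never materialized. Fixed-width types such as i32 and i64 would be skipped by their known size rather than a length prefix; the sketch treats every field as a string to stay short.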
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.