Posted to issues@hive.apache.org by "slim bouguerra (JIRA)" <ji...@apache.org> on 2018/10/25 16:19:00 UTC

[jira] [Commented] (HIVE-20486) Kafka: Use Row SerDe + vectorization

    [ https://issues.apache.org/jira/browse/HIVE-20486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16663962#comment-16663962 ] 

slim bouguerra commented on HIVE-20486:
---------------------------------------

I looked at the suggested option and the code base:
https://github.com/apache/hive/blob/37c7fd7833eba087eadd8048dbc63b403b272104/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java#L1462

I see that it can only be enabled for some preselected input formats.
I have therefore created a new vectorized Kafka reader.



> Kafka: Use Row SerDe + vectorization
> ------------------------------------
>
>                 Key: HIVE-20486
>                 URL: https://issues.apache.org/jira/browse/HIVE-20486
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Gopal V
>            Priority: Major
>
> KafkaHandler returns unvectorized rows, which causes the downstream operators to be slower and sub-optimal.
> Hive has a vectorization shim which allows Kafka streams without complex projections to be wrapped into a vectorized reader via {{hive.vectorized.use.row.serde.deserialize}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)