Posted to issues@hive.apache.org by "Vaibhav Gumashta (JIRA)" <ji...@apache.org> on 2016/05/18 22:45:13 UTC

[jira] [Updated] (HIVE-13770) Improve Thrift result set streaming when serializing thrift ResultSets in tasks

     [ https://issues.apache.org/jira/browse/HIVE-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vaibhav Gumashta updated HIVE-13770:
------------------------------------
    Issue Type: Sub-task  (was: Improvement)
        Parent: HIVE-12427

> Improve Thrift result set streaming when serializing thrift ResultSets in tasks
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-13770
>                 URL: https://issues.apache.org/jira/browse/HIVE-13770
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Holman Lan
>
> When the Thrift result set is serialized in the final task, i.e. when the hive.server2.thrift.resultset.serialize.in.tasks property is set to true, HS2 does not start sending results until the entire result set has been written to HDFS.
> This is inefficient; we should find a way for HS2 to start sending results as soon as a block of results becomes available. The advantage is twofold. First, the client can start consuming results much sooner. Second, we can reclaim the HDFS storage used by a result set block as soon as that block has been successfully sent to the client.
> It is worth checking whether this is also the case when the Thrift result set is not serialized in the final task.
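
The streaming behaviour proposed above could look roughly like the following producer/consumer sketch. It is not Hive code: the class and method names (ResultBlockStreaming, startWriter, streamToClient) are hypothetical, an in-memory bounded queue stands in for result blocks written to HDFS, and appending to a list stands in for both sending a block to the client and deleting its storage. It only illustrates the idea that the sender starts as soon as the first block is ready and reclaims each block right after it is sent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration only; none of these names exist in Hive.
class ResultBlockStreaming {

    // Stands in for the final task: publishes each block as soon as it is
    // written, then publishes an empty Optional to mark the end of results.
    static Thread startWriter(List<String> blocks,
                              BlockingQueue<Optional<String>> ready) {
        Thread writer = new Thread(() -> {
            try {
                for (String block : blocks) {
                    ready.put(Optional.of(block)); // blocks while the queue is full
                }
                ready.put(Optional.empty());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();
        return writer;
    }

    // Stands in for HS2: takes each block as soon as it is available,
    // "sends" it, then records it as reclaimed, without waiting for the
    // whole result set to be written first.
    static List<String> streamToClient(BlockingQueue<Optional<String>> ready,
                                       List<String> reclaimed)
            throws InterruptedException {
        List<String> sent = new ArrayList<>();
        while (true) {
            Optional<String> next = ready.take();
            if (!next.isPresent()) {
                break; // end-of-results marker
            }
            String block = next.get();
            sent.add(block);      // assumption: stand-in for sending to the client
            reclaimed.add(block); // assumption: stand-in for freeing HDFS space
        }
        return sent;
    }

    public static void main(String[] args) throws InterruptedException {
        // A capacity of 2 bounds how many unsent blocks exist at any time.
        BlockingQueue<Optional<String>> ready = new ArrayBlockingQueue<>(2);
        List<String> reclaimed = new ArrayList<>();
        Thread writer = startWriter(
                Arrays.asList("block-0", "block-1", "block-2"), ready);
        List<String> sent = streamToClient(ready, reclaimed);
        writer.join();
        System.out.println("sent=" + sent + " reclaimed=" + reclaimed);
    }
}
```

In this sketch the bounded queue also caps how much unsent result data is outstanding, which corresponds to the storage-reclamation benefit described in the issue; whether HS2 could apply similar back-pressure to the final task is an open design question, not something the issue states.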



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)