Posted to issues@tez.apache.org by "Sergey Shelukhin (JIRA)" <ji...@apache.org> on 2014/04/23 04:14:18 UTC

[jira] [Commented] (TEZ-1081) expose some basic statistics from org.apache.tez.runtime.api.Input (or similar)

    [ https://issues.apache.org/jira/browse/TEZ-1081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977740#comment-13977740 ] 

Sergey Shelukhin commented on TEZ-1081:
---------------------------------------

[~sseth] [~t3rmin4t0r] fyi

> expose some basic statistics from org.apache.tez.runtime.api.Input (or similar)
> -------------------------------------------------------------------------------
>
>                 Key: TEZ-1081
>                 URL: https://issues.apache.org/jira/browse/TEZ-1081
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Sergey Shelukhin
>
> Hive loads data from org.apache.tez.runtime.api.Input into mapjoin hashtables. It would be useful to know in advance:
> 1) How many rows the input contains (should be easy to add).
> 2) How many unique keys it contains (even an approximation would do).
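
To illustrate why these two numbers would help, here is a minimal sketch of how a consumer could track a row count and an approximate unique-key count, then use the estimate to presize a hashtable. It is not part of the Tez API: the class InputStatsSketch and all of its methods are hypothetical names invented for this example. Linear counting is only one possible estimator, chosen here for brevity.

```java
import java.util.BitSet;

/**
 * Hypothetical helper (NOT a Tez or Hive class): counts rows and estimates
 * the number of distinct keys with linear counting over a fixed-size bitmap.
 */
final class InputStatsSketch {
    private final int numBits;   // bitmap size m; larger values give better accuracy
    private final BitSet bitmap;
    private long rowCount;

    InputStatsSketch(int numBits) {
        if (numBits <= 0) {
            throw new IllegalArgumentException("numBits must be positive");
        }
        this.numBits = numBits;
        this.bitmap = new BitSet(numBits);
    }

    /** Records one row, identified by its key. */
    void add(Object key) {
        rowCount++;
        bitmap.set(Math.floorMod(mix(key.hashCode()), numBits));
    }

    /** Exact number of rows seen so far. */
    long rowCount() {
        return rowCount;
    }

    /** Linear-counting estimate n ~= -m * ln(zeroBits / m), capped at rowCount. */
    long estimateDistinctKeys() {
        int zeroBits = numBits - bitmap.cardinality();
        if (zeroBits == 0) {
            // Bitmap is saturated, so the formula diverges; rowCount is a safe upper bound.
            return rowCount;
        }
        long estimate = Math.round(-numBits * Math.log((double) zeroBits / numBits));
        return Math.min(estimate, rowCount);
    }

    /** Capacity at which expectedKeys entries stay under the given load factor. */
    static int initialCapacity(long expectedKeys, float loadFactor) {
        long capacity = (long) Math.ceil(expectedKeys / (double) loadFactor);
        return (int) Math.min(Math.max(capacity, 16L), 1L << 30);
    }

    /** Bit-mixing step (MurmurHash3 32-bit finalizer) to spread weak hashCodes. */
    private static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }
}
```

With such statistics available up front, a caller could construct its table as new HashMap<>(InputStatsSketch.initialCapacity(estimate, 0.75f)) and avoid repeated rehashing while loading. The bitmap has to be sized generously relative to the expected key count, because the estimate degrades as the bitmap fills up.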



--
This message was sent by Atlassian JIRA
(v6.2#6252)