Posted to dev@pig.apache.org by "Rohini Palaniswamy (JIRA)" <ji...@apache.org> on 2015/04/02 00:12:53 UTC

[jira] [Assigned] (PIG-4483) Pig on Tez output statistics shows storing to same directory twice for union

     [ https://issues.apache.org/jira/browse/PIG-4483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy reassigned PIG-4483:
---------------------------------------

    Assignee: Rohini Palaniswamy

> Pig on Tez output statistics shows storing to same directory twice for union
> ----------------------------------------------------------------------------
>
>                 Key: PIG-4483
>                 URL: https://issues.apache.org/jira/browse/PIG-4483
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.14.0
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>              Labels: newbie
>
> For the script below:
> A = LOAD 'data1';
> B = LOAD 'data2';
> C = UNION A, B;
> STORE C into 'data3';
> The output messages are shown as below, because the union uses a vertex group and the store happens from separate vertices:
> Successfully stored 10 records (xxx bytes) in: "data3"
> Successfully stored 20 records (yyy bytes) in: "data3"
> Even though this is correct, it can be confusing for users, who have to sum the counts before comparing them with the Pig on MR output message. OutputStats entries with the same filename should be combined and shown as:
> Successfully stored 30 records (xxx bytes) in: "data3"
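
The combining step requested above could be sketched roughly as follows. This is only an illustration in Python, not Pig's actual code: the OutputStat tuple, combine_by_location and format_message are hypothetical names, the byte counts are made-up numbers, and summing the byte counts along with the record counts is an assumption (the issue only spells out the record total).

```python
from collections import OrderedDict
from typing import Iterable, List, NamedTuple


class OutputStat(NamedTuple):
    """Hypothetical stand-in for one per-vertex store result."""
    location: str
    records: int
    bytes_written: int


def combine_by_location(stats: Iterable[OutputStat]) -> List[OutputStat]:
    """Merge entries that share a location, keeping first-seen order."""
    merged = OrderedDict()
    for stat in stats:
        previous = merged.get(stat.location)
        if previous is None:
            merged[stat.location] = stat
        else:
            # Assumption: both records and bytes are summed per location.
            merged[stat.location] = OutputStat(
                stat.location,
                previous.records + stat.records,
                previous.bytes_written + stat.bytes_written,
            )
    return list(merged.values())


def format_message(stat: OutputStat) -> str:
    """Render one entry in the style of the messages quoted in the issue."""
    return 'Successfully stored {} records ({} bytes) in: "{}"'.format(
        stat.records, stat.bytes_written, stat.location
    )


if __name__ == "__main__":
    # Two vertices of the union each stored into 'data3' (illustrative sizes).
    per_vertex = [OutputStat("data3", 10, 100), OutputStat("data3", 20, 250)]
    for combined in combine_by_location(per_vertex):
        print(format_message(combined))
```

With the two per-vertex entries from the example, this prints a single line for "data3" with 30 records, which is directly comparable to the Pig on MR message.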



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)