Posted to dev@pig.apache.org by "Xuefu Zhang (JIRA)" <ji...@apache.org> on 2015/07/21 05:39:04 UTC

[jira] [Commented] (PIG-4634) Fix records count issues in output statistics

    [ https://issues.apache.org/jira/browse/PIG-4634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634479#comment-14634479 ] 

Xuefu Zhang commented on PIG-4634:
----------------------------------

Patch looks good. Three questions:

1. In getRecordCount(), we throw a runtime exception if the job metrics are null. This might fail the job. Is that too harsh?
2. We are aggregating records across different stages. What, then, does the record count mean exactly?
3. Is there a way to test this?
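
On question 1, one softer option would be to log a warning and return a sentinel value instead of throwing. The following is only a minimal sketch of that idea: the class name, the metrics map, the metric key, and the -1 "unknown" sentinel are all hypothetical and are not taken from the patch.

```java
import java.util.Map;
import java.util.logging.Logger;

public class RecordCountSketch {
    private static final Logger LOG =
            Logger.getLogger(RecordCountSketch.class.getName());

    // Hypothetical sentinel meaning "record count unknown".
    static final long UNKNOWN = -1L;

    // jobMetrics is a hypothetical stand-in (metric name -> value) for
    // whatever the real job metrics object is; it may be null.
    static long getRecordCount(Map<String, Long> jobMetrics, String metricName) {
        if (jobMetrics == null) {
            // Warn and degrade gracefully rather than failing the whole job.
            LOG.warning("Job metrics unavailable; record count is unknown");
            return UNKNOWN;
        }
        Long value = jobMetrics.get(metricName);
        return value == null ? UNKNOWN : value;
    }
}
```

With this shape, missing metrics only degrade the statistics output, and callers can check for the sentinel before reporting a count.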

> Fix records count issues in output statistics
> ---------------------------------------------
>
>                 Key: PIG-4634
>                 URL: https://issues.apache.org/jira/browse/PIG-4634
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: kexianda
>            Assignee: kexianda
>             Fix For: spark-branch
>
>         Attachments: PIG-4634.patch
>
>
> Test cases simpleTest() and simpleTest2() in TestPigRunner failed because of the following issues:
> 1. pig context in SparkPigStats isn't initialized.
> 2. the records count logic hasn't been implemented.
> 3. getOutputAlias(), getPigProperties(), getBytesWritten() and getRecordWritten() have not been implemented.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)