Posted to dev@pig.apache.org by "Olga Natkovich (JIRA)" <ji...@apache.org> on 2011/03/30 02:32:05 UTC

[jira] [Commented] (PIG-1899) Pig needs a tool for doing end to end testing efficiently

    [ https://issues.apache.org/jira/browse/PIG-1899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012785#comment-13012785 ] 

Olga Natkovich commented on PIG-1899:
-------------------------------------

The patch looks good. A few comments:

(1) Several scripts are placed directly under pig/tools. I wonder if we should have a test subdirectory under it for them.
(2) It would be great for each file, especially the scripts and UDFs, to have a little more information on what it does. For instance, generate_data.pl just says that it generates data, but not what kind or what parameters it supports.
(3) CreateMap.java, TOMAP.java - there is already a TOMAP function in builtin, which I think does something very similar.
(4) UPPER.java, TOBAG.java - these are also part of builtins.
(5) pig/udfs/java/build.xml - not sure exactly what this is for, but the location is kind of strange and it also refers to HowlDriver.
(6) There is also one reference to yahoo that needs to be removed (just grep for yahoo).

> Pig needs a tool for doing end to end testing efficiently
> ---------------------------------------------------------
>
>                 Key: PIG-1899
>                 URL: https://issues.apache.org/jira/browse/PIG-1899
>             Project: Pig
>          Issue Type: Test
>          Components: tools
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>         Attachments: PIG-1899.patch, PIG-1899.patch, e2e.patch
>
>
> Pig currently uses junit for all testing.  junit is good for unit tests, but limited for end to end and integration testing.
> Building an end to end test in junit is cumbersome (a lot of setup and such to do using MiniCluster).  Because expected results must be known beforehand and hand-crafted, they must be kept very small, usually ten rows or fewer.  This does not lead to realistic testing scenarios.
> A test tool is needed that allows the test developer to write a Pig Latin script and specify a source of truth against which to test the results of running this Pig Latin script.  A database or a previous version of Pig can then be used as that source of truth.  This will allow developers to quickly add new tests that return more than trivial results.
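
As a rough illustration of the "source of truth" comparison described above, here is a minimal Python sketch. It is not taken from the attached patch. The function name rows_match and the tab delimiter are assumptions made for this example. It treats each output as a multiset of rows, so row order does not matter but duplicate rows do.

```python
from collections import Counter


def rows_match(actual_lines, expected_lines, delimiter="\t"):
    """Hypothetical helper: True if both outputs hold the same rows.

    Row order is ignored; the number of times each row occurs is not.
    """
    def as_multiset(lines):
        # Drop the trailing newline, skip empty lines, split into fields.
        stripped = (line.rstrip("\n") for line in lines)
        return Counter(tuple(s.split(delimiter)) for s in stripped if s)

    return as_multiset(actual_lines) == as_multiset(expected_lines)


if __name__ == "__main__":
    # Output of the script under test vs. output from the source of truth
    # (for example a database query or an older Pig release).
    actual = ["alice\t3\n", "bob\t5\n"]
    expected = ["bob\t5\n", "alice\t3\n"]
    print(rows_match(actual, expected))
```

Comparing multisets rather than sorted text keeps the check independent of the order in which rows are written, which matters when the results are too large to craft by hand.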

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira