Posted to dev@pig.apache.org by "Klaus Ackermann (JIRA)" <ji...@apache.org> on 2014/06/02 11:51:02 UTC

[jira] [Created] (PIG-3980) PigStorage with -tagFile option always replaces first tuple, even when not selecting it with foreach command

Klaus Ackermann created PIG-3980:
------------------------------------

             Summary: PigStorage with -tagFile option always replaces first tuple, even when not selecting it with foreach command
                 Key: PIG-3980
                 URL: https://issues.apache.org/jira/browse/PIG-3980
             Project: Pig
          Issue Type: Bug
          Components: parser
    Affects Versions: 0.12.1
         Environment: Mac OS X 10.7.5, Java 1.7, Hadoop 2.2.0
            Reporter: Klaus Ackermann


When the -tagFile option is specified, the first field of each tuple produced by a foreach statement is always overwritten with the file name, even when the file name is not selected. In the following example, the result contains the file name and the dividend instead of the date and the dividend.

Example:

divs   = load 'data.csv' using PigStorage(';','-tagFile') as (file:chararray, exchange:chararray, symbol:chararray, date:chararray, dividends:float);

subtable = foreach divs generate date as d, dividends as divs;
store subtable into 'sub_dividend';


Test Input data.csv:
NYSE;CPO;2009-12-30;0.14
NYSE;CPO;2009-01-06;0.14
NYSE;CCS;2009-10-28;0.414
NYSE;CCS;2009-01-28;0.414
NYSE;CIF;2009-12-09;0.029

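PigUnit test that reproduces it:
import java.io.IOException;
import org.apache.pig.pigunit.PigTest;
import org.apache.pig.tools.parameters.ParseException;
import org.junit.Test;

@Test
public void testPigScript() throws IOException, ParseException {
    String[] script = {
        // With -tagFile the file name is loaded as the first field ($0).
        "divs   = load 'data.csv' using PigStorage(';','-tagFile') as (file:chararray, exchange:chararray, symbol:chararray, date:chararray, dividends:float);",
        "B = foreach divs generate $0;",
        // Only date and dividends are selected; the file name is not.
        "subtable = foreach divs generate date as d, dividends as divs;",
        "store subtable into 'sub_dividend';",
    };
    PigTest test = new PigTest(script);

    // Expected result: the date and dividend of every input row.
    String[] output = { "(2009-12-30,0.14)\n" +
            "(2009-01-06,0.14)\n" +
            "(2009-10-28,0.414)\n" +
            "(2009-01-28,0.414)\n" +
            "(2009-12-09,0.029)" };

    test.assertOutput("subtable", output);
}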
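
For reference, the expected output above can be derived by hand. The following is a minimal plain-Java sketch, not Pig code: it assumes only that -tagFile prepends the file name as field $0 (so date and dividends sit at positions 3 and 4) and then projects those two positions. The class and method names (TagFileProjectionSketch, tagWithFile, project) are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TagFileProjectionSketch {

    // Mimics the intended effect of -tagFile: the file name becomes field $0,
    // followed by the delimited fields of the input line.
    public static List<String> tagWithFile(String fileName, String line, String delimiter) {
        List<String> fields = new ArrayList<>();
        fields.add(fileName);
        fields.addAll(Arrays.asList(line.split(delimiter)));
        return fields;
    }

    // Selects the given field positions and renders them like a Pig tuple.
    public static String project(List<String> fields, int... positions) {
        StringBuilder sb = new StringBuilder("(");
        for (int i = 0; i < positions.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(fields.get(positions[i]));
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        String[] lines = {
            "NYSE;CPO;2009-12-30;0.14",
            "NYSE;CPO;2009-01-06;0.14",
            "NYSE;CCS;2009-10-28;0.414",
            "NYSE;CCS;2009-01-28;0.414",
            "NYSE;CIF;2009-12-09;0.029",
        };
        for (String line : lines) {
            // Positions 3 and 4 correspond to date and dividends in the load schema.
            List<String> tagged = tagWithFile("data.csv", line, ";");
            System.out.println(project(tagged, 3, 4));
        }
    }
}
```

Under these assumptions, the first row yields (2009-12-30,0.14), which matches the first line of the expected output in the PigUnit test rather than a tuple that starts with the file name.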





