Posted to user@pig.apache.org by Joe Gutierrez <jo...@brightroll.com> on 2012/02/22 00:21:37 UTC

ERROR 1066: Unable to open iterator for alias

Hi all,

I am trying to run one of the examples from the Pig 0.9.2 distribution and
am running into the following exception:


    [junit] Unable to open iterator for alias queries_limit
    [junit] org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias queries_limit
    [junit] at org.apache.pig.PigServer.openIterator(PigServer.java:901)
    [junit] at org.apache.pig.pigunit.PigTest.getAlias(PigTest.java:183)
    [junit] at TopQueriesTest.testTop2Queries(TopQueriesTest.java:32)
    [junit] Caused by: java.io.IOException: Couldn't retrieve job.
    [junit] at org.apache.pig.PigServer.store(PigServer.java:965)
    [junit] at org.apache.pig.PigServer.openIterator(PigServer.java:876)

Below is my Pig script:

data = LOAD '$input' AS (query:CHARARRAY, count:INT);
queries_group = GROUP data BY query;
queries_sum = FOREACH queries_group GENERATE group AS query,
    SUM(data.count) AS count;
queries_ordered = ORDER queries_sum BY count DESC;
queries_limit = LIMIT queries_ordered $limit_count;
STORE queries_limit INTO '$output';

and below is my input data:

yahoo 10
twitter 7
facebook 10
yahoo 15
facebook 5

I've included my PigUnit code as well:

public void testQueries() throws Exception
{
    String[] args = {
        "limit_count=3",
        "input=<path_to_input_file>",
        "output=<path_to_output_file>",
    };

    test = new PigTest(<path_to_pig_script>, args);

    String[] output = {
        "(yahoo,25)",
        "(facebook,15)",
        "(twitter,7)",
    };

    test.assertOutput("queries_limit", output);

}

When I execute the test, I receive the exception above. When I alter the
unit test to assert on queries_sum, I receive the correct values. However,
when I assert on queries_ordered or queries_limit, I receive the exception.
How do I resolve this issue? Thanks in advance.

-- 
Joe Gutierrez :: Software Developer

Re: ERROR 1066: Unable to open iterator for alias

Posted by praveenesh kumar <pr...@gmail.com>.
I sometimes get this error when my input data is malformed. Please check
your data again to make sure it is properly tab-separated.
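
A minimal sketch of such a check, assuming the script relies on Pig's
default loader (which splits fields on tabs) and that every line should be
a query followed by an integer count. The class and method names are
hypothetical and not part of Pig or PigUnit:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Pig or PigUnit.
final class TabSeparatedCheck {
    /** Returns the 1-based numbers of lines that are not "<query>TAB<integer count>". */
    static List<Integer> malformedLines(List<String> lines) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            // A limit of -1 keeps trailing empty fields, so "yahoo<TAB>" yields two fields.
            String[] fields = lines.get(i).split("\t", -1);
            boolean ok = fields.length == 2
                    && !fields[0].isEmpty()
                    && fields[1].matches("-?\\d+");
            if (!ok) {
                bad.add(i + 1);
            }
        }
        return bad;
    }
}
```

Passing it the lines of the input file lists every line that is separated
by spaces instead of a tab, or whose count is missing or not an integer.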

Thanks,
Praveenesh

On Wed, Feb 22, 2012 at 4:45 PM, Dmitriy Ryaboy <dv...@gmail.com> wrote:

> Hi Joe, you get this error when one of the component MapReduce jobs dies.
> Try going to the job tracker and seeing if there are failed jobs -- you'll
> be able to get the logs of individual tasks that might contain the actual
> error.

Re: ERROR 1066: Unable to open iterator for alias

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Hi Joe, you get this error when one of the component MapReduce jobs dies. Try going to the job tracker and seeing if there are failed jobs -- you'll be able to get the logs of individual tasks that might contain the actual error.
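
Before turning to the task logs, it can also help to see what the client side already knows: ERROR 1066 is only the outermost exception, and the trace above shows a nested "Caused by". A minimal sketch in plain Java that follows the cause chain to its innermost exception; the class and method names are hypothetical and not part of Pig or PigUnit, and the root cause found this way may still be less specific than what the task logs show:

```java
// Hypothetical helper, not part of Pig or PigUnit.
final class RootCause {
    /** Follows getCause() links and returns the innermost exception. */
    static Throwable of(Throwable t) {
        Throwable current = t;
        // Stop when there is no further cause, or when a cause points back at itself.
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }
}
```

In a test, the call that fails can be wrapped in a try/catch and the result of RootCause.of(e) printed, so the innermost message appears next to the ERROR 1066 line.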
