Posted to dev@pig.apache.org by "Olga Natkovich (JIRA)" <ji...@apache.org> on 2009/05/21 03:31:45 UTC
[jira] Resolved: (PIG-812) COUNT(*) does not work
[ https://issues.apache.org/jira/browse/PIG-812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Olga Natkovich resolved PIG-812.
--------------------------------
Resolution: Won't Fix
The fact that this worked in earlier code was a bug. Pig now has a consistent implementation of *.
From the beginning, Pig chose different semantics for * than SQL. (It is unfortunate that we chose to use "*" for this, but it is something we need to keep for consistency and backward compatibility.)
"*" in SQL means a relation, while in Pig it means a tuple passed to an operator. So in Pig you can order on the entire row by saying
B = order A by *;
You can also pass the entire row to a UDF by doing myUDF(*).
In this context COUNT(*) makes no sense.
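To count rows, the usual approach is to pass the grouped bag to COUNT by name rather than "*", since COUNT operates on a bag while "*" hands it a tuple. A minimal sketch, reusing the reporter's script quoted below and assuming the bag produced by GROUP ... ALL takes the name of the grouped relation (studenttab):

```
studenttab = LOAD 'studenttab10k' AS (name:chararray, age:int, gpa:float);
-- GROUP ... ALL collects every row into a single group
X2 = GROUP studenttab ALL;
-- X2 holds (group, studenttab), where studenttab is a bag of the original rows;
-- pass that bag to COUNT instead of *
Y2 = FOREACH X2 GENERATE COUNT(studenttab);
DUMP Y2;
```

Running describe X2; first shows the actual field names to use in place of studenttab if the alias differs.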
> COUNT(*) does not work
> -----------------------
>
> Key: PIG-812
> URL: https://issues.apache.org/jira/browse/PIG-812
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.2.0
> Reporter: Viraj Bhat
> Priority: Critical
> Fix For: 0.2.0
>
> Attachments: studenttab10k
>
>
> Pig script to count the number of rows in a studenttab10k file which contains 10k records.
> {code}
> studenttab = LOAD 'studenttab10k' AS (name:chararray, age:int,gpa:float);
> X2 = GROUP studenttab ALL;
> describe X2;
> Y2 = FOREACH X2 GENERATE COUNT(*);
> explain Y2;
> DUMP Y2;
> {code}
> returns the following error
> ================================================================
> ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias Y2
> Details at logfile: /homes/viraj/pig-svn/trunk/pig_1242783700970.log
> ================================================================
> If you look at the log file:
> ================================================================
> Caused by: java.lang.ClassCastException
> at org.apache.pig.builtin.COUNT$Initial.exec(COUNT.java:76)
> at org.apache.pig.builtin.COUNT$Initial.exec(COUNT.java:68)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:201)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:235)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:254)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:204)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:231)
> at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:223)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:245)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:236)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:88)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
> ================================================================