Posted to dev@pig.apache.org by "Graham Lea (JIRA)" <ji...@apache.org> on 2011/09/26 03:43:26 UTC
[jira] [Created] (PIG-2303) Documentation says the MAX function can
be used on chararray, but it can't
Documentation says the MAX function can be used on chararray, but it can't
--------------------------------------------------------------------------
Key: PIG-2303
URL: https://issues.apache.org/jira/browse/PIG-2303
Project: Pig
Issue Type: Bug
Components: documentation
Reporter: Graham Lea
Priority: Trivial
Here: http://pig.apache.org/docs/r0.9.0/func.html#max
It says MIN/MAX can be used on chararray, but the result of those functions is always a double.
Had to search through the Pig Javadoc to find that I should use StringMin/StringMax instead.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-2303) FOREACH doesn't allow explicit schema
for MAX() result to be chararray
Posted by "Graham Lea (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Graham Lea updated PIG-2303:
----------------------------
Affects Version/s: 0.8.1 (was: 0.9.0)
Fix Version/s: 0.9.0
Oh dear.
I'm using the latest Cloudera distribution, which is based on 0.8.1.
Sorry!
[jira] [Resolved] (PIG-2303) Documentation says the MAX function
can be used on chararray, but it can't
Posted by "Daniel Dai (Resolved) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai resolved PIG-2303.
-----------------------------
Resolution: Invalid
[jira] [Commented] (PIG-2303) FOREACH doesn't allow explicit schema
for MAX() result to be chararray
Posted by "Daniel Dai (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13115098#comment-13115098 ]
Daniel Dai commented on PIG-2303:
---------------------------------
I still don't see the problem. I ran this successfully:
{code}
A = LOAD 'student.txt' USING PigStorage(',') AS (name:chararray, age:int, gpa:chararray);
B = group A by age;
C = foreach B generate MAX(A.name) as maxName: chararray;
explain C;
{code}
describe C gives me:
C: {maxName: chararray}
I'm using Pig 0.9.0.
[jira] [Commented] (PIG-2303) Documentation says the MAX function
can be used on chararray, but it can't
Posted by "Graham Lea (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114431#comment-13114431 ]
Graham Lea commented on PIG-2303:
---------------------------------
Note: The error I was getting on entering the line using MIN + MAX into Grunt was:
{code}
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1022: Type mismatch merging schema prefix. Field Schema: double. Other Field Schema: fieldName: chararray
{code}
(Just adding this to help people searching for this error message to find a solution.)
[jira] [Updated] (PIG-2303) FOREACH doesn't allow explicit schema
for MAX() result to be chararray
Posted by "Graham Lea (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Graham Lea updated PIG-2303:
----------------------------
Component/s: parser (was: documentation)
Affects Version/s: 0.9.0
Labels: (was: documentation)
Summary: FOREACH doesn't allow explicit schema for MAX() result to be chararray (was: Documentation says the MAX function can be used on chararray, but it can't)
[jira] [Commented] (PIG-2303) Documentation says the MAX function
can be used on chararray, but it can't
Posted by "Daniel Dai (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13115042#comment-13115042 ]
Daniel Dai commented on PIG-2303:
---------------------------------
Hi, Graham, I ran the following script with success:
{code}
A = LOAD 'student.txt' AS (name:chararray, age:int, gpa:chararray);
B = group A by age;
C = foreach B generate MAX(A.name);
dump C;
{code}
I believe your script fails for a different reason. Can you post your script?
[jira] [Resolved] (PIG-2303) FOREACH doesn't allow explicit schema
for MAX() result to be chararray
Posted by "Graham Lea (Resolved) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Graham Lea resolved PIG-2303.
-----------------------------
Resolution: Cannot Reproduce
[jira] [Commented] (PIG-2303) Documentation says the MAX function
can be used on chararray, but it can't
Posted by "Graham Lea (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13115014#comment-13115014 ]
Graham Lea commented on PIG-2303:
---------------------------------
From the log file:
{noformat}
Pig Stack Trace
---------------
ERROR 1022: Type mismatch merging schema prefix. Field Schema: double. Other Field Schema: firstMonth: chararray
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Type mismatch merging schema prefix. Field Schema: double. Other Field Schema: firstMonth: chararray
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1618)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1562)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:534)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:871)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:388)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:76)
    at org.apache.pig.Main.run(Main.java:455)
    at org.apache.pig.Main.main(Main.java:107)
Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Problems in merging user defined schema
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:853)
    at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1612)
    ... 9 more
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1016: Problems in merging user defined schema
    at org.apache.pig.impl.logicalLayer.LOForEach.getSchema(LOForEach.java:337)
    at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:851)
    ... 11 more
Caused by: org.apache.pig.impl.logicalLayer.schema.SchemaMergeException: ERROR 1022: Type mismatch merging schema prefix. Field Schema: double. Other Field Schema: firstMonth: chararray
    at org.apache.pig.impl.logicalLayer.schema.Schema$FieldSchema.mergePrefixFieldSchema(Schema.java:550)
    at org.apache.pig.impl.logicalLayer.schema.Schema$FieldSchema.mergePrefixFieldSchema(Schema.java:474)
    at org.apache.pig.impl.logicalLayer.LOForEach.getSchema(LOForEach.java:332)
    ... 12 more
================================================================================
{noformat}
[jira] [Commented] (PIG-2303) Documentation says the MAX function
can be used on chararray, but it can't
Posted by "Daniel Dai (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114912#comment-13114912 ]
Daniel Dai commented on PIG-2303:
---------------------------------
It's not about MAX/MIN. "+" does not concatenate strings. You need to use the CONCAT UDF.
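As a minimal sketch of that suggestion, string concatenation with the builtin looks roughly like the following. The file name 'names.txt' and the field names are made up for illustration and do not come from the reporter's script.

{code}
-- Hypothetical input: two tab-separated chararray fields per line.
A = LOAD 'names.txt' AS (first:chararray, last:chararray);
-- CONCAT joins two expressions of the same type; "+" is arithmetic only.
B = FOREACH A GENERATE CONCAT(first, last) AS fullName;
DUMP B;
{code}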
[jira] [Reopened] (PIG-2303) Documentation says the MAX function
can be used on chararray, but it can't
Posted by "Graham Lea (Reopened) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Graham Lea reopened PIG-2303:
-----------------------------
I don't believe I'm using any string concatenation, unless this is happening under the hood without my knowledge.
Here's a simple script showing what I described:
{code}
$ bin/pig
2011-09-27 09:01:26,411 [main] INFO org.apache.pig.Main - Logging error messages to: /home/development/pig-0.8.1-cdh3u1/pig_1317078086408.log
2011-09-27 09:01:26,592 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost:9000
2011-09-27 09:01:26,751 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: localhost:9001
grunt> A = LOAD 'clientInvoiceIds' AS (clientId:int, invoiceMonth:chararray);
grunt> B = GROUP A BY clientId;
grunt> describe B
B: {group: int,A: {clientId: int,invoiceMonth: chararray}}
grunt> C = FOREACH B GENERATE group, MIN(A.invoiceMonth) as firstMonth:chararray, MAX(A.invoiceMonth) as lastMonth:chararray;
2011-09-27 09:01:45,992 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1022: Type mismatch merging schema prefix. Field Schema: double. Other Field Schema: firstMonth: chararray
Details at logfile: /home/development/pig-0.8.1-cdh3u1/pig_1317078086408.log
{code}
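For anyone hitting the same error on 0.8.1, the workaround mentioned in the original report can be sketched against the script above roughly as follows. This is an untested sketch: it assumes that StringMin and StringMax (the builtins named in the report) resolve by their short names in this Pig version, and that FOREACH accepts the explicit chararray schema for them.

{code}
A = LOAD 'clientInvoiceIds' AS (clientId:int, invoiceMonth:chararray);
B = GROUP A BY clientId;
-- Use the chararray-specific builtins instead of the generic MIN/MAX,
-- so the declared chararray type is not merged against a double.
C = FOREACH B GENERATE group,
        StringMin(A.invoiceMonth) AS firstMonth:chararray,
        StringMax(A.invoiceMonth) AS lastMonth:chararray;
DESCRIBE C;
{code}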
[jira] [Commented] (PIG-2303) Documentation says the MAX function
can be used on chararray, but it can't
Posted by "Graham Lea (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13115082#comment-13115082 ]
Graham Lea commented on PIG-2303:
---------------------------------
Hi Daniel.
If you just add a schema to the definition of 'C' in your own script, you will see the problem:
{code}
C = foreach B generate MAX(A.name) as maxName: chararray;
{code}
Result:
{noformat}
ERROR 1022: Type mismatch merging schema prefix. Field Schema: double. Other Field Schema: maxName: chararray
{noformat}
Interestingly, without the schema a describe of C produces "C: {chararray}".
So it seems that the type of the function is correct (chararray in, chararray out) but perhaps the schema processing of FOREACH is incorrectly inferring the result type to be double?
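One possible way to sidestep the schema merge is to name the field without declaring its type. This is only a sketch, and it rests on the assumption (not verified on 0.8.1) that the explicit type, rather than the alias itself, is what triggers ERROR 1022:

{code}
A = LOAD 'student.txt' AS (name:chararray, age:int, gpa:chararray);
B = GROUP A BY age;
-- Alias only, no ": chararray"; the result type is left for Pig to infer.
C = FOREACH B GENERATE MAX(A.name) AS maxName;
DESCRIBE C;
{code}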