Posted to user@pig.apache.org by yogesh dhari <yo...@live.com> on 2012/09/30 00:02:27 UTC

how to perform GROUP BY in PIG for this case:


Hi all,

I have this data, with fields (Date, symbol, rate), and I want to group it by month and find the maximum rate value for each month,

like: for month (08, 36.3), (09, 36.4), (10, 36.8), (11, 37.5) ..

(2009-08-21,CLI,33.38)
(2009-08-24,CLI,33.03)
(2009-08-25,CLI,33.16)
(2009-08-26,CLI,32.78)
(2009-08-27,CLI,32.79)
(2009-08-28,CLI,33.37)
(2009-08-31,CLI,32.51)
(2009-09-11,CLI,34.08)
(2009-09-14,CLI,35.19)
(2009-09-15,CLI,35.82)
(2009-09-16,CLI,36.58)
(2009-09-24,CLI,33.98)
(2009-09-25,CLI,32.44)
(2009-09-28,CLI,33.34)
(2009-09-29,CLI,33.6)
(2009-09-30,CLI,33.24)
(2009-10-01,CLI,31.98)
(2009-10-02,CLI,31.21)
(2009-10-05,CLI,31.31)
(2009-10-21,CLI,32.86)
(2009-10-26,CLI,33.15)
(2009-10-27,CLI,32.71)
(2009-10-28,CLI,32.03)
(2009-10-29,CLI,32.05)
(2009-10-30,CLI,31.88)
(2009-11-02,CLI,31.88)
(2009-11-03,CLI,31.16)
(2009-11-04,CLI,31.47)
(2009-11-09,CLI,31.59)
(2009-11-25,CLI,30.58)
(2009-11-27,CLI,30.19)
(2009-11-30,CLI,30.86)
(2009-12-01,CLI,31.74)
(2009-12-02,CLI,32.62)
(2009-12-03,CLI,33.43)
(2009-12-04,CLI,34.12)
(2009-12-07,CLI,33.77)
(2009-12-08,CLI,33.8)
(2009-12-09,CLI,33.71)

Please help and suggest.

Thanks & Regards
Yogesh Kumar

RE: how to perform GROUP BY in PIG for this case:

Posted by yogesh dhari <yo...@live.com>.
Thanks Russell :-),

I have built Pig in /opt/pig-0.10.0/ and in /opt/pig-0.10.0/contrib/piggybank/java/, and the jar files are now present there.

I registered them:

grunt> register /opt/pig-0.10.0/contrib/piggybank/java/piggybank.jar
grunt> register /opt/pig-0.10.0/build/ivy/lib/Pig/joda-time-1.6.jar

and also defined:

grunt> define CustomFormatToISO org.apache.pig.piggybank.evaluation.datetime.convert.CustomFormatToISO();
grunt> define ISOToMonth org.apache.pig.piggybank.evaluation.datetime.convert.ISOToMonth();

Then I ran the query on NYSE_B:

grunt> describe NYSE_B;
NYSE_B: {exchange: chararray,symbol: chararray,date: chararray,divi: float}

ans = foreach (group NYSE_B by ISOToMonth(date)) generate group as monthh, MAX(NYSE_A.divi) as max_rt;

and again got an error :-(

2012-09-30 10:25:15,821 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve org.apache.pig.piggybank.evaluation.datetime.convert.ISOToMonth using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]
2012-09-30 10:25:15,822 [main] WARN  org.apache.pig.tools.grunt.Grunt - There is no log file to write to.
2012-09-30 10:25:15,822 [main] ERROR org.apache.pig.tools.grunt.Grunt - Failed to parse: Pig script failed to parse: 
<line 12, column 31> Failed to generate logical plan. Nested exception: java.lang.RuntimeException: Cannot instantiate: org.apache.pig.piggybank.evaluation.datetime.convert.ISOToMonth
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:182)
    at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1565)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1538)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:540)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:970)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
    at org.apache.pig.Main.run(Main.java:490)
    at org.apache.pig.Main.main(Main.java:111)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: 
<line 12, column 31> Failed to generate logical plan. Nested exception: java.lang.RuntimeException: Cannot instantiate: org.apache.pig.piggybank.evaluation.datetime.convert.ISOToMonth
    at org.apache.pig.parser.LogicalPlanBuilder.buildUDF(LogicalPlanBuilder.java:980)
    at org.apache.pig.parser.LogicalPlanGenerator.func_eval(LogicalPlanGenerator.java:7316)
    at org.apache.pig.parser.LogicalPlanGenerator.projectable_expr(LogicalPlanGenerator.java:8857)
    at org.apache.pig.parser.LogicalPlanGenerator.var_expr(LogicalPlanGenerator.java:8632)
    at org.apache.pig.parser.LogicalPlanGenerator.expr(LogicalPlanGenerator.java:7984)
    at org.apache.pig.parser.LogicalPlanGenerator.join_group_by_expr(LogicalPlanGenerator.java:12100)
    at org.apache.pig.parser.LogicalPlanGenerator.join_group_by_clause(LogicalPlanGenerator.java:11921)
    at org.apache.pig.parser.LogicalPlanGenerator.group_item(LogicalPlanGenerator.java:5440)
    at org.apache.pig.parser.LogicalPlanGenerator.group_clause(LogicalPlanGenerator.java:5026)
    at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1313)
    at org.apache.pig.parser.LogicalPlanGenerator.inline_op(LogicalPlanGenerator.java:5739)
    at org.apache.pig.parser.LogicalPlanGenerator.rel(LogicalPlanGenerator.java:5669)
    at org.apache.pig.parser.LogicalPlanGenerator.foreach_clause(LogicalPlanGenerator.java:12350)
    at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1577)
    at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:789)
    at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:507)
    at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:382)
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:175)
    ... 15 more


Please suggest and help

Thanks & regards
Yogesh Kumar.
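
A possible fix, offered as a sketch rather than a confirmed resolution: the ERROR 1070 text names org.apache.pig.piggybank.evaluation.datetime.convert.ISOToMonth, while the Piggybank API link and the define statement given elsewhere in this thread place ISOToMonth under datetime.truncate. The posted query also groups NYSE_B but aggregates NYSE_A.divi. The statements below assume the two jar paths from this post are valid after the ant build and that NYSE_B is already loaded in the session.

```pig
-- Assumes the jar paths from this post exist after the ant build.
REGISTER /opt/pig-0.10.0/contrib/piggybank/java/piggybank.jar;
REGISTER /opt/pig-0.10.0/build/ivy/lib/Pig/joda-time-1.6.jar;

-- ISOToMonth is defined from the "truncate" package rather than "convert".
DEFINE ISOToMonth org.apache.pig.piggybank.evaluation.datetime.truncate.ISOToMonth();

-- Assumes NYSE_B is already loaded with the schema shown by DESCRIBE above.
-- MAX reads from NYSE_B, the relation being grouped.
ans = FOREACH (GROUP NYSE_B BY ISOToMonth(date)) GENERATE group AS monthh, MAX(NYSE_B.divi) AS max_rt;
DUMP ans;
```

If the date column is not already an ISO 8601 string, it would need converting first (for example with CustomFormatToISO) before ISOToMonth is applied.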



> From: russell.jurney@gmail.com
> Date: Sat, 29 Sep 2012 20:36:47 -0700
> Subject: Re: how to perform GROUP BY in PIG for this case:
> To: user@pig.apache.org
> 
> You'll need to build Pig. Assuming you have the source, run 'ant' in
> the base directory and in contrib/piggybank/java.
> 
> Russell Jurney http://datasyndrome.com

Re: how to perform GROUP BY in PIG for this case:

Posted by Russell Jurney <ru...@gmail.com>.
You'll need to build Pig. Assuming you have the source, run 'ant' in
the base directory and in contrib/piggybank/java.

Russell Jurney http://datasyndrome.com

On Sep 29, 2012, at 8:19 PM, yogesh dhari <yo...@live.com> wrote:

>
>
> Hi russell,
>
> I am using Pig 0.10.0, and I checked the directory /opt/pig-0.10.0/contrib/piggybank/java/.
>
> There are no jar files there. :-(
>
> grunt> register /opt/pig-0.10.0/contrib/piggybank/java/piggybank.jar
>
> [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 101: file '/opt/pig-0.10.0/contrib/piggybank/java/piggybank.jar' does not exist.
> Details at logfile: /opt/pig-0.10.0/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/evaluation/datetime/convert/pig_1348974384533.log
>
> Similarly, there is no path /opt/build/ivy/lib/Pig/.
>
> Instead, /opt/pig-0.10.0/ivy is there, but it has no /lib/Pig/.
>
> Please suggest & help
>
> Thanks & regards
> Yogesh Kumar

RE: how to perform GROUP BY in PIG for this case:

Posted by yogesh dhari <yo...@live.com>.

Hi russell,

I am using Pig 0.10.0, and I checked the directory /opt/pig-0.10.0/contrib/piggybank/java/.

There are no jar files there. :-(

grunt> register /opt/pig-0.10.0/contrib/piggybank/java/piggybank.jar

[main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 101: file '/opt/pig-0.10.0/contrib/piggybank/java/piggybank.jar' does not exist.
Details at logfile: /opt/pig-0.10.0/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/evaluation/datetime/convert/pig_1348974384533.log

Similarly, there is no path /opt/build/ivy/lib/Pig/.

Instead, /opt/pig-0.10.0/ivy is there, but it has no /lib/Pig/.

Please suggest & help

Thanks & regards
Yogesh Kumar




> From: russell.jurney@gmail.com
> Date: Sat, 29 Sep 2012 19:21:17 -0700
> Subject: Re: how to perform GROUP BY in PIG for this case:
> To: user@pig.apache.org
> 
> My bad - you will need to register the Piggybank and Joda-Time jars. Replace
> /me/pig with your Pig install path.
>
> register /me/pig/contrib/piggybank/java/piggybank.jar
> register /me/pig/build/ivy/lib/Pig/joda-time-1.6.jar
>
> define CustomFormatToISO org.apache.pig.piggybank.evaluation.datetime.convert.CustomFormatToISO();
>
> define ISOToMonth org.apache.pig.piggybank.evaluation.datetime.truncate.ISOToMonth();
> 
> 
> That should take care of the error.
> 
> This example may help:
> https://github.com/rjurney/Collecting-Data/blob/master/src/pig/rfc1123_to_iso8601.pig
> 
> Russell Jurney http://datasyndrome.com

Re: how to perform GROUP BY in PIG for this case:

Posted by Russell Jurney <ru...@gmail.com>.
My bad - you will need to register the Piggybank and Joda-Time jars. Replace
/me/pig with your Pig install path.

register /me/pig/contrib/piggybank/java/piggybank.jar
register /me/pig/build/ivy/lib/Pig/joda-time-1.6.jar

define CustomFormatToISO org.apache.pig.piggybank.evaluation.datetime.convert.CustomFormatToISO();

define ISOToMonth org.apache.pig.piggybank.evaluation.datetime.truncate.ISOToMonth();


That should take care of the error.

This example may help:
https://github.com/rjurney/Collecting-Data/blob/master/src/pig/rfc1123_to_iso8601.pig

Russell Jurney http://datasyndrome.com

On Sep 29, 2012, at 4:33 PM, yogesh dhari <yo...@live.com> wrote:


> Thanks Russell,
>
> I am new to Pig. I tried this command and got this exception:
>
> 2012-09-30 04:53:22,995 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> ERROR 1070: Could not resolve ISOToMonth using imports: [,
> org.apache.pig.builtin., org.apache.pig.impl.builtin.]
>
> Is there something more I need to do, like an import or something like that?
>
> Please suggest.
>
> Thanks & regards
> Yogesh Kumar


RE: how to perform GROUP BY in PIG for this case:

Posted by yogesh dhari <yo...@live.com>.
Thanks Russell,

I am new to Pig. I tried this command and got this exception:

2012-09-30 04:53:22,995 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve ISOToMonth using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]

Is there something more I need to do, like an import or something like that?

Please suggest.

Thanks & regards
Yogesh Kumar

> From: russell.jurney@gmail.com
> Date: Sat, 29 Sep 2012 16:15:18 -0700
> Subject: Re: how to perform GROUP BY in PIG for this case:
> To: user@pig.apache.org
> 
> answer = foreach (group data by ISOToMonth(Date)) generate group as
> month, MAX(data.rate) as max_rate;
> 
> Note that you will need your date in ISO 8601 format. You can use
> CustomFormatToISO to convert it if it is a string, or UnixToISO if
> your date is a long.
> 
> See:
> 
> http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/piggybank/evaluation/datetime/convert/CustomFormatToISO.html
> http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/piggybank/evaluation/datetime/convert/UnixToISO.html
> http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/piggybank/evaluation/datetime/truncate/ISOToMonth.html
> 
> 
> 
> Russell Jurney http://datasyndrome.com

Re: how to perform GROUP BY in PIG for this case:

Posted by Russell Jurney <ru...@gmail.com>.
answer = foreach (group data by ISOToMonth(Date)) generate group as
month, MAX(data.rate) as max_rate;

Note that you will need your date in ISO 8601 format. You can use
CustomFormatToISO to convert it if it is a string, or UnixToISO if
your date is a long.

See:

http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/piggybank/evaluation/datetime/convert/CustomFormatToISO.html
http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/piggybank/evaluation/datetime/convert/UnixToISO.html
http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/piggybank/evaluation/datetime/truncate/ISOToMonth.html
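
Put together, the conversion and grouping steps might look like the sketch below. It is not a tested script: the input file name rates.csv, the relation names data and iso, and the field names trade_date and iso_date are hypothetical, and the input is assumed to be comma-separated lines such as 2009-08-21,CLI,33.38. The /me/pig prefix stands for the Pig install path, and both jars are assumed to have been built already.

```pig
-- Replace /me/pig with the Pig install path; both jars must already exist.
REGISTER /me/pig/contrib/piggybank/java/piggybank.jar;
REGISTER /me/pig/build/ivy/lib/Pig/joda-time-1.6.jar;

DEFINE CustomFormatToISO org.apache.pig.piggybank.evaluation.datetime.convert.CustomFormatToISO();
DEFINE ISOToMonth org.apache.pig.piggybank.evaluation.datetime.truncate.ISOToMonth();

-- Hypothetical input: comma-separated lines such as 2009-08-21,CLI,33.38
data = LOAD 'rates.csv' USING PigStorage(',') AS (trade_date:chararray, symbol:chararray, rate:float);

-- Convert the yyyy-MM-dd string to an ISO 8601 datetime string.
iso = FOREACH data GENERATE CustomFormatToISO(trade_date, 'yyyy-MM-dd') AS iso_date, symbol, rate;

-- Truncate each datetime to its month, then take the maximum rate per month.
answer = FOREACH (GROUP iso BY ISOToMonth(iso_date)) GENERATE group AS month, MAX(iso.rate) AS max_rate;
DUMP answer;
```

Going by the ISOToMonth documentation linked above, the group key should be an ISO 8601 string truncated to the start of the month rather than a bare 08, so the same month in different years would fall into separate groups.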



Russell Jurney http://datasyndrome.com
