Posted to user@pig.apache.org by Dexin Wang <wa...@gmail.com> on 2011/01/13 02:43:51 UTC

how to use builtin String functions

I see there are some builtin string functions, but I don't know how to use
them. I get this error when I follow the examples:

grunt> REGEX_EXTRACT_ALL('192.168.1.5:8020', '(.*)\:(.*)');
2011-01-12 19:34:23,773 [main] ERROR org.apache.pig.tools.grunt.Grunt -
ERROR 1000: Error during parsing. Encountered " <IDENTIFIER>
"REGEX_EXTRACT_ALL "" at line 1, column 1.

Thanks.

Re: how to use builtin String functions

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Hm, a bunch of UDFs were moved from piggybank into the Pig builtins for 0.8,
so they won't be in a 0.7 jar. Try building piggybank, putting it on your
classpath, and registering it at the head of your script.
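
A minimal sketch of that suggestion, assuming piggybank has already been built
into a jar and that UPPER lives in the org.apache.pig.piggybank.evaluation.string
package (worth confirming with "jar tf" on your own piggybank.jar). The jar
path, the alias PB_UPPER, and the input file name are made up for illustration:

```pig
-- Hypothetical path: point this at wherever your piggybank.jar was built.
REGISTER /path/to/piggybank.jar;

-- Short alias for the piggybank class (package name is an assumption;
-- check it against the output of: jar tf piggybank.jar | grep UPPER).
DEFINE PB_UPPER org.apache.pig.piggybank.evaluation.string.UPPER();

-- Hypothetical input file with the same schema as relation A in this thread.
A = LOAD 'codes.txt' USING PigStorage(',') AS (code:chararray, v:int);

-- The UDF is called inside FOREACH ... GENERATE, not as a bare statement.
C = FOREACH A GENERATE PB_UPPER(code), v;
DUMP C;
```

On 0.8 and later, where UPPER is a builtin, the REGISTER and DEFINE lines
should not be needed.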

Re: how to use builtin String functions

Posted by Dexin Wang <wa...@gmail.com>.
jar tf pig-0.7.0_9-core.jar | grep builtin

shows over 100 class files like MIN, MAX, SUM, COUNT, etc but no UPPER or
any other string function I want to use. Of course "grep UPPER" shows
nothing.

Which jar file are these string functions supposed to be in?

Re: how to use builtin String functions

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Try this:

jar tf pig.jar | grep UPPER

to see if the UDFs are in your jar, and what package they live in.

Re: how to use builtin String functions

Posted by Dexin Wang <wa...@gmail.com>.
Thanks.

Somehow, it's not recognizing these functions.

grunt> DUMP A;
(a-b-c,1)
(x-y,2)
(z,3)
grunt> DESCRIBE A;
A: {code: chararray,v: int}
grunt> B = FOREACH A GENERATE REGEX_EXTRACT_ALL(code, '(.*)-(.*)');
2011-01-13 17:35:48,062 [main] ERROR org.apache.pig.tools.grunt.Grunt -
ERROR 1070: Could not resolve REGEX_EXTRACT_ALL using imports: [,
org.apache.pig.builtin., org.apache.pig.impl.builtin.]

It doesn't even like UPPER:

grunt> C = FOREACH A GENERATE UPPER(code);
2011-01-13 17:37:51,317 [main] ERROR org.apache.pig.tools.grunt.Grunt -
ERROR 1070: Could not resolve UPPER using imports: [,
org.apache.pig.builtin., org.apache.pig.impl.builtin.]



Re: how to use builtin String functions

Posted by Thejas M Nair <te...@yahoo-inc.com>.
The functions need to be part of an expression in a relational operator, for example -

f = foreach l generate  REGEX_EXTRACT_ALL('192.168.1.5:8020', '(.*)\\:(.*)');

(The example above is contrived, since it does not use any of the columns in the input relation.)
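
As a sketch of the same call applied to an actual column, assuming Pig 0.8 or later (where REGEX_EXTRACT_ALL is a builtin) and a hypothetical input file endpoints.txt holding one host:port string per line. The relation and field names are illustrative only:

```pig
-- Hypothetical input: one "host:port" string per line, e.g. 192.168.1.5:8020
l = LOAD 'endpoints.txt' AS (addr:chararray);

-- REGEX_EXTRACT_ALL returns a tuple of the captured groups, so FLATTEN
-- turns it into separate fields. The backslash is doubled because it
-- sits inside a Pig string literal.
f = FOREACH l GENERATE
        FLATTEN(REGEX_EXTRACT_ALL(addr, '(.*)\\:(.*)'))
        AS (host:chararray, port:chararray);

DUMP f;
```

Typing the function call on its own at the grunt prompt fails with the parse error (ERROR 1000), because grunt expects a statement such as an assignment, not a bare expression.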

-Thejas

