Posted to user@pig.apache.org by Eli Finkelshteyn <ie...@gmail.com> on 2012/02/16 18:50:52 UTC

PIG Regex Problems

Hi,
I'm trying to do a pretty simple regex test in PIG right now and getting 
a weird error. All I'm doing is:

orig_set = load '/data/dictionaries/Eng-Spa.dic' USING PigStorage('\t') 
AS (orig: CHARARRAY, trans: CHARARRAY);
filtered = FILTER orig_set BY REGEX_EXTRACT(orig, '^[\\#\\<]') == 1;

The error I get is:
2012-02-16 12:45:24,000 [main] ERROR org.apache.pig.tools.grunt.Grunt - 
ERROR 1045: Could not infer the matching function for 
org.apache.pig.builtin.REGEX_EXTRACT as multiple or none of them fit. 
Please use an explicit cast

Ideas?

Cheers,
Eli

Re: PIG Regex Problems

Posted by Eli Finkelshteyn <ie...@gmail.com>.
Cool, actually, I just got what I wanted to work like this:

filtered = FILTER orig_set BY orig MATCHES '^[\\#\\<].*';

I didn't know MATCHES worked for regex before. Sweet!

Eli
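
A likely reason the MATCHES version needs the trailing '.*': Pig's MATCHES operator uses Java regular expressions, and the pattern has to match the whole string rather than just a prefix. The sketch below shows that difference in plain Java. The class and method names are hypothetical, and this is only an illustration of full-match versus substring-search semantics, not Pig's own implementation.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of Pig.
public class MatchesDemo {
    // Same character class as in the thread. '#' and '<' need no escaping inside [...].
    static final Pattern PREFIX_ONLY = Pattern.compile("^[#<]");
    static final Pattern WHOLE_LINE = Pattern.compile("^[#<].*");

    // Whole-string match: the pattern must consume the entire input.
    static boolean fullMatch(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    // Substring search: the pattern only has to match somewhere in the input.
    static boolean partialMatch(Pattern p, String s) {
        return p.matcher(s).find();
    }

    public static void main(String[] args) {
        String entry = "#comment line";
        System.out.println(fullMatch(PREFIX_ONLY, entry));    // false
        System.out.println(fullMatch(WHOLE_LINE, entry));     // true
        System.out.println(partialMatch(PREFIX_ONLY, entry)); // true
    }
}
```

Separately, the Pig documentation lists REGEX_EXTRACT as taking three arguments (string, regex, index) and returning a chararray, which would be consistent with ERROR 1045 on the two-argument call compared against 1.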

On 2/16/12 12:54 PM, Grig Gheorghiu wrote:
> Can you try with RegexMatch? I am doing something similar in one of my
> scripts and it works fine.
>
> Grig


Re: PIG Regex Problems

Posted by Grig Gheorghiu <gr...@gmail.com>.
Can you try with RegexMatch? I am doing something similar in one of my
scripts and it works fine.

Grig