Posted to dev@pig.apache.org by "Thejas M Nair (JIRA)" <ji...@apache.org> on 2010/12/16 21:11:02 UTC

[jira] Created: (PIG-1770) matches clause problem with chars that have special meaning in dk.brics - #, @ ..

matches clause problem with chars that have special meaning in dk.brics - #, @ ..
---------------------------------------------------------------------------------

                 Key: PIG-1770
                 URL: https://issues.apache.org/jira/browse/PIG-1770
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.8.0
            Reporter: Thejas M Nair
            Assignee: Thejas M Nair
             Fix For: 0.8.0


When the special characters # and @, or any of the other 'optional' patterns described at http://www.brics.dk/automaton/doc/dk/brics/automaton/RegExp.html#RegExp%28java.lang.String%29, are used in a matches clause, the regex does not match as expected.

This is related to PIG-965.

An example and a workaround follow:

{code}
grunt> cat t.txt                           
asd#asdf
zxcasdf
2#asdf

grunt> l = load 't.txt' as (a : chararray);
grunt> f = filter l by (a matches '.*#.*');
grunt> dump f; 
-- No output, though two rows are expected.

-- As a workaround, escape the # with a \. This regex is valid in 0.7 as well, and it will remain valid after this bug is fixed (it is a valid Java regex with the same meaning as the regex above).
grunt> f = filter l by (a matches '.*\\#.*');
grunt> dump f; 
asd#asdf
2#asdf
{code}
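
The claim that the escaped form is a valid Java regex with the same meaning can be checked outside Pig. The following is a minimal sketch using only java.util.regex; the class name EscapedHashDemo and the helper filter are hypothetical names made up for this illustration, and the whole-string match is assumed to mirror what the matches operator does. It does not exercise the dk.brics code path where the bug occurs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Pig.
class EscapedHashDemo {
    // Keep only the rows that the regex matches in full
    // (assumed here to mirror the semantics of Pig's matches operator).
    public static List<String> filter(List<String> rows, String regex) {
        Pattern p = Pattern.compile(regex);
        List<String> kept = new ArrayList<>();
        for (String row : rows) {
            if (p.matcher(row).matches()) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Same rows as t.txt in the report above.
        List<String> rows = Arrays.asList("asd#asdf", "zxcasdf", "2#asdf");
        // Unescaped #: an ordinary literal character in java.util.regex.
        System.out.println(filter(rows, ".*#.*"));   // prints [asd#asdf, 2#asdf]
        // Escaped #: the Java literal "\\#" is the regex \#, also a literal #.
        System.out.println(filter(rows, ".*\\#.*")); // prints [asd#asdf, 2#asdf]
    }
}
```

Both patterns select the same two rows under java.util.regex, which is why the escaped form is a safe workaround before and after the fix.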


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (PIG-1770) matches clause problem with chars that have special meaning in dk.brics - #, @ ..

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1770:
-------------------------------

    Fix Version/s: 0.9.0
           Status: Patch Available  (was: Open)


[jira] Updated: (PIG-1770) matches clause problem with chars that have special meaning in dk.brics - #, @ ..

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1770:
-------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Patch committed to 0.8 branch and trunk.



[jira] Updated: (PIG-1770) matches clause problem with chars that have special meaning in dk.brics - #, @ ..

Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1770:
-------------------------------

    Attachment: PIG-1770.1.patch

PIG-1770.1.patch: this patch disables the optional regex patterns in the automaton library so that pattern matching works as expected.
Unit tests pass. Output of test-patch:
     [exec] +1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     +1 tests included.  The patch appears to include 3 new or modified tests.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.



[jira] Commented: (PIG-1770) matches clause problem with chars that have special meaning in dk.brics - #, @ ..

Posted by "Richard Ding (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005806#comment-13005806 ] 

Richard Ding commented on PIG-1770:
-----------------------------------

+1
