Posted to dev@pig.apache.org by "Thejas M Nair (JIRA)" <ji...@apache.org> on 2010/12/16 21:11:02 UTC
[jira] Created: (PIG-1770) matches clause problem with chars that
have special meaning in dk.brics - #, @ ..
matches clause problem with chars that have special meaning in dk.brics - #, @ ..
---------------------------------------------------------------------------------
Key: PIG-1770
URL: https://issues.apache.org/jira/browse/PIG-1770
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.8.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
Fix For: 0.8.0
When the special characters # and @, or the other 'optional' patterns described at http://www.brics.dk/automaton/doc/dk/brics/automaton/RegExp.html#RegExp%28java.lang.String%29, are used in a matches pattern, the regex match fails to work.
This is related to PIG-965.
An example and a workaround follow:
{code}
grunt> cat t.txt
asd#asdf
zxcasdf
2#asdf
grunt> l = load 't.txt' as (a : chararray);
grunt> f = filter l by (a matches '.*#.*');
grunt> dump f;
-- No output, though two rows are expected.
-- As a workaround, escape the # with a backslash. This regex is valid in 0.7 and will remain valid after this bug is fixed (it is a valid Java regex with the same meaning as the regex above).
grunt> f = filter l by (a matches '.*\\#.*');
grunt> dump f;
asd#asdf
2#asdf
{code}
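To illustrate the point in the workaround comment that the escaped and unescaped patterns mean the same thing as Java regexes, here is a minimal sketch that uses only java.util.regex. The class name EscapedHashDemo and the helper fullMatch are hypothetical names chosen for illustration; they are not part of Pig or of any patch for this issue. The sketch does not exercise dk.brics at all. It only shows that, under plain Java regex semantics, both patterns select the same two rows of t.txt.

```java
import java.util.regex.Pattern;

public class EscapedHashDemo {
    // Hypothetical helper: true only when the whole input matches the regex,
    // the same full-match semantics as String.matches.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Rows taken from t.txt in the example above.
        String[] rows = {"asd#asdf", "zxcasdf", "2#asdf"};
        String plain = ".*#.*";      // regex: .*#.*
        String escaped = ".*\\#.*";  // regex: .*\#.*  (backslash before a non-alphabetic char)
        for (String row : rows) {
            System.out.println(row
                    + " plain=" + fullMatch(plain, row)
                    + " escaped=" + fullMatch(escaped, row));
        }
    }
}
```

Running main should report true for asd#asdf and 2#asdf and false for zxcasdf under both patterns, which is why the escaped form is a safe workaround wherever java.util.regex ends up evaluating the match.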
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (PIG-1770) matches clause problem with chars that
have special meaning in dk.brics - #, @ ..
Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thejas M Nair updated PIG-1770:
-------------------------------
Fix Version/s: 0.9.0
Status: Patch Available (was: Open)
[jira] Updated: (PIG-1770) matches clause problem with chars that
have special meaning in dk.brics - #, @ ..
Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thejas M Nair updated PIG-1770:
-------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
Patch committed to the 0.8 branch and trunk.
[jira] Updated: (PIG-1770) matches clause problem with chars that
have special meaning in dk.brics - #, @ ..
Posted by "Thejas M Nair (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thejas M Nair updated PIG-1770:
-------------------------------
Attachment: PIG-1770.1.patch
PIG-1770.1.patch - With this patch, the optional regex patterns of the automaton library are disabled, so pattern matching works as expected.
Unit tests pass. Output of test-patch:
[exec] +1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 3 new or modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
[jira] Commented: (PIG-1770) matches clause problem with chars that
have special meaning in dk.brics - #, @ ..
Posted by "Richard Ding (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005806#comment-13005806 ]
Richard Ding commented on PIG-1770:
-----------------------------------
+1