Posted to dev@pig.apache.org by "Adrien Bidault (JIRA)" <ji...@apache.org> on 2015/04/15 17:20:58 UTC

[jira] [Created] (PIG-4507) Problem with REGEX which just match for the first word

Adrien Bidault created PIG-4507:
-----------------------------------

             Summary: Problem with REGEX which just match for the first word
                 Key: PIG-4507
                 URL: https://issues.apache.org/jira/browse/PIG-4507
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.12.0
         Environment: IBM Infosphere BigInsights v3.0.0.1
            Reporter: Adrien Bidault


I am trying to remove punctuation and special symbols from a string using a regular expression of the form "(\\w+)". The problem is that the pattern is applied only to the first word of the string.

Example:
clean3 = FOREACH clean1 GENERATE id, REGEX_EXTRACT_ALL('toto,  likes ... to play ', '(\\w+)');
It returns only "toto" instead of "toto likes to play".
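
The same symptom can be reproduced outside Pig with plain java.util.regex, which Pig's regex functions rely on. The following is only a minimal sketch under that assumption, not Pig's actual implementation; the class and method names (WordExtractSketch, firstWord, allWords) are hypothetical. It shows that a pattern with a single capture group yields one word per match attempt, so every match has to be iterated over to collect the whole sentence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only (not part of Pig).
public class WordExtractSketch {
    // Same pattern as in the report: one capture group of word characters.
    private static final Pattern WORD = Pattern.compile("(\\w+)");

    // A single find() stops at the first match, so only the first word comes back.
    public static String firstWord(String input) {
        Matcher m = WORD.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    // Calling find() repeatedly walks through every match in the input.
    public static String allWords(String input) {
        Matcher m = WORD.matcher(input);
        List<String> words = new ArrayList<>();
        while (m.find()) {
            words.add(m.group(1));
        }
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        String input = "toto,  likes ... to play ";
        System.out.println(firstWord(input)); // toto
        System.out.println(allWords(input));  // toto likes to play
    }
}
```

Whether an equivalent loop is available directly in Pig Latin for this version is not established here; the sketch only illustrates why one capture group returns one word.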

Would you guys have any ideas?


