Posted to dev@pig.apache.org by "Adrien Bidault (JIRA)" <ji...@apache.org> on 2015/04/15 17:20:58 UTC
[jira] [Created] (PIG-4507) Problem with REGEX which just match for the first word
Adrien Bidault created PIG-4507:
-----------------------------------
Summary: Problem with REGEX which just match for the first word
Key: PIG-4507
URL: https://issues.apache.org/jira/browse/PIG-4507
Project: Pig
Issue Type: Bug
Affects Versions: 0.12.0
Environment: IBM Infosphere BigInsights v3.0.0.1
Reporter: Adrien Bidault
I am trying to strip punctuation and special symbols from a string using REGEX_EXTRACT_ALL with the pattern "(\\w+)". The problem is that only the first word of the string is matched.
Example:
clean3 = FOREACH clean1 GENERATE id, REGEX_EXTRACT_ALL('toto, likes ... to play ', '(\\w+)');
It just returns "toto" instead of "toto likes to play".
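For comparison, one possible workaround is to replace each run of non-word characters with a single space rather than extracting the words. This is only a sketch, assuming the built-in REPLACE and TRIM functions behave as documented (REPLACE applying the regex to every occurrence); the alias "cleaned" is a hypothetical name.

```pig
-- Sketch of a workaround, not a fix for REGEX_EXTRACT_ALL itself.
-- REPLACE turns every run of non-word characters (\W+) into one space,
-- and TRIM removes the space left over at the end of the string.
-- `clean1` and `id` come from the example above; `cleaned` is an assumed alias.
clean3 = FOREACH clean1 GENERATE id, TRIM(REPLACE('toto, likes ... to play ', '\\W+', ' ')) AS cleaned;
```

With Java regex semantics this should yield "toto likes to play" for the sample string.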
Would you guys have any ideas?
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)