Posted to user@pig.apache.org by jr <jo...@io-consulting.net> on 2010/04/08 12:08:14 UTC
SPLIT with matches / not matches
Hello everybody,
I'm trying to split Apache log data into two output sets, one for all
URIs that match a certain criterion and one for URIs that don't match
it.
I've been trying
SPLIT list INTO downloads IF uri matches '/downloads/.*\\.exe', pages if
(NOT uri matches '/downloads/.*\\.exe');
but this doesn't seem to do the trick (downloads is fine, but pages
contains the downloads too :/)
Any hints on how to do this?
Johannes
Re: SPLIT with matches / not matches
Posted by Rekha Joshi <re...@yahoo-inc.com>.
I suppose there may be some issue with SPLIT. It would be ideal to have your input set, but did you try a two-step FILTER operation instead of SPLIT, and did that give the correct answer? If not, I would look at the regex expression.
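A minimal sketch of the two-step FILTER approach, reusing the relation name (list), the field name (uri), and the pattern from the original post. The LOAD and STORE lines, including the paths and the single-column schema, are assumptions added only to make the script self-contained; adjust them to the real log layout.

```pig
-- Hypothetical input: path, delimiter, and schema are assumptions.
list = LOAD 'access_log' USING PigStorage(' ') AS (uri:chararray);

-- Step 1: keep only URIs that match the pattern.
downloads = FILTER list BY uri MATCHES '/downloads/.*\\.exe';

-- Step 2: keep only URIs that do not match the same pattern.
-- The parentheses make it explicit that NOT applies to the whole match.
pages = FILTER list BY NOT (uri MATCHES '/downloads/.*\\.exe');

-- Hypothetical output locations.
STORE downloads INTO 'out/downloads';
STORE pages INTO 'out/pages';
```

If pages comes out correct with this version, that would point at SPLIT rather than at the regex. Note that rows where uri is null should land in neither relation, since a comparison against null does not evaluate to true in either filter.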