Posted to user@pig.apache.org by zjffdu <zj...@gmail.com> on 2009/11/22 22:58:55 UTC

RE: more advanced string comparisons

Aaron,

Why not use a UDF for the substring comparison?
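
For illustration, here is a minimal sketch of the comparison logic such a UDF might wrap. It is plain Java with no Pig dependency so that it compiles on its own. The class and method names (SubstringCompare, startsWithField, substringEquals) are hypothetical and not part of Pig. In an actual UDF this logic would presumably live in the exec(Tuple) method of a class extending org.apache.pig.FilterFunc, with the two fields read from the input tuple.

```java
// Sketch only: plain Java, no Pig classes, so it can be compiled standalone.
// Assumption: in a real Pig filter UDF, these checks would be called from
// exec(Tuple) after pulling the two field values out of the input tuple.
// All names below are hypothetical.
final class SubstringCompare {
    private SubstringCompare() {}

    /** True when 'value' begins with 'prefix'; a null on either side counts as no match. */
    static boolean startsWithField(String value, String prefix) {
        if (value == null || prefix == null) {
            return false;
        }
        // Literal comparison, so regex metacharacters in 'prefix' need no escaping.
        return value.startsWith(prefix);
    }

    /** True when value.substring(begin, end) equals 'other'; out-of-range bounds count as no match. */
    static boolean substringEquals(String value, int begin, int end, String other) {
        if (value == null || other == null) {
            return false;
        }
        if (begin < 0 || end > value.length() || begin > end) {
            return false;
        }
        return value.substring(begin, end).equals(other);
    }
}
```

A prefix check like startsWithField(field2, field1) covers the "field2 matches CONCAT(field1, '.*')" case from the original question, and it treats field1 as a literal string rather than as a regular expression.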


Jeff Zhang


-----Original Message-----
From: Aaron Kimball [mailto:aaron@cloudera.com] 
Sent: June 2, 2009 19:41
To: pig-user@hadoop.apache.org
Subject: more advanced string comparisons

Hi Pig mavens,

I'm curious what my options are for more flexible string-based data
comparison. I need to check whether a substring of one field is
equivalent to another field.

I can "FILTER (record) BY string_field1 == string_field2," or use the !=
operator. But these require using the entire field. I can also use the
'matches' operator -- but the RHS of the matches operator must be a quoted
string, it seems. So I can't dynamically do something like: field2 matches
CONCAT(field1, '.*').

Are there operators for working with substrings, string length, or other
partial comparison features?

Thanks,
- Aaron