Posted to user@pig.apache.org by lei tang <fi...@gmail.com> on 2012/09/29 01:05:56 UTC

regular expression as delimiter in PigStorage?

Hi,

Is it possible to use a regular expression as the delimiter when loading
data, say something like
A = load 'data' using PigStorage('\s+');

However, according to the docs, it seems that only a single character is
accepted as the delimiter.
http://pig.apache.org/docs/r0.10.0/api/org/apache/pig/builtin/PigStorage.html

Just wondering whether there is any way to achieve something similar to the
command above. BTW, I'm using Pig 0.10.0.

Thanks,
- Lei

Re: regular expression as delimiter in PigStorage?

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Hi Lei,
This is currently not supported.
However, one can always write a new LoadFunc and implement the parsing
oneself (perhaps by extending PigStorage and overriding the parsing bits).

D
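
As a rough illustration of the "parsing bits" mentioned above, here is a
minimal sketch of the line-splitting step a custom loader would need. The
class name RegexLineSplitter is hypothetical and not part of Pig. It uses only
java.util.regex, and wiring it into an actual LoadFunc (for example in its
getNext() method, converting the fields to a Tuple) is left out because that
depends on the Pig version in use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper (not a Pig class): splits one input line into fields
// using a regular expression as the delimiter, e.g. "\\s+".
final class RegexLineSplitter {
    // Compile once; a loader would reuse this for every line it reads.
    private final Pattern delimiter;

    RegexLineSplitter(String delimiterRegex) {
        this.delimiter = Pattern.compile(delimiterRegex);
    }

    // Returns the fields of a single line. The limit of -1 keeps trailing
    // empty fields instead of silently dropping them.
    List<String> split(String line) {
        String[] parts = delimiter.split(line, -1);
        return new ArrayList<>(Arrays.asList(parts));
    }
}
```

With new RegexLineSplitter("\\s+"), a line whose fields are separated by any
mix of spaces and tabs is split on each whitespace run. A custom loader would
call split() on each line it reads and copy the fields into its output tuple.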
