Posted to user@pig.apache.org by sridhar basam <sr...@basam.org> on 2011/05/26 18:10:25 UTC

Pig Load function with variable tuple length

Hey,
       I have a file similar to syslog output. It has one tuple per line, space
separated, but the number of fields per tuple varies if you use
the standard PigStorage function to load the file.
The first 4 fields are always present and have a strict format; the rest
of the line I would like to treat as a single chararray (including spaces).
Is there any way for me to do that in Pig?

thanks,
             Sridhar

Re: Pig Load function with variable tuple length

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
The simplest thing to do might be to use TextLoader and parse the
lines yourself, using either the built-in regex extraction functions
(REGEX_EXTRACT, REGEX_EXTRACT_ALL) or a custom UDF.
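
A minimal sketch of the regex route, assuming the four fixed fields contain no
spaces and are separated by exactly one space. The file name and the field
names (f1..f4, rest) are placeholders, not anything from the original data:

```pig
-- TextLoader reads each line as a single chararray, with no field splitting.
raw = LOAD 'syslog_like.txt' USING TextLoader() AS (line:chararray);

-- Capture four space-delimited fields, then everything after them as one field.
-- REGEX_EXTRACT_ALL returns a tuple of the capture groups; FLATTEN unpacks it.
parsed = FOREACH raw GENERATE
    FLATTEN(REGEX_EXTRACT_ALL(line, '([^ ]+) ([^ ]+) ([^ ]+) ([^ ]+) (.*)'))
    AS (f1:chararray, f2:chararray, f3:chararray, f4:chararray, rest:chararray);

DUMP parsed;
```

Lines that do not match the pattern (for example, fewer than four fields
followed by a space) should come back as nulls, so it may be worth filtering
on f1 IS NOT NULL before further processing.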
