Posted to user@pig.apache.org by Michael Malak <mi...@yahoo.com> on 2013/01/18 19:49:19 UTC

Hard-coded inline relations

I'm new to Pig, and it looks like there is no provision to declare relations inline in a Pig script (without LOADing from an external file)?

Based on
http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#Constants
I would have thought the following would constitute "Hello World" for Pig:

A = {('Hello'),('World')};
DUMP A;

But I get a syntax error.  The ability to inline relations would be useful for debugging.  Is this limitation by design, or is it just not implemented yet?


Re: Hard-coded inline relations

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
OK, you can build it from my Elephant-Bird (EB) branch:

https://github.com/dvryaboy/elephant-bird/tree/add_locationtuple_loader

You will want to build the elephant-bird-pig package, and the loader is

com.twitter.elephantbird.pig.load.LocationAsTuple

Here's the javadoc:

/**
 * Parses the "location" into a tuple by splitting on a delimiter, and returns it.
 * Handy for turning scalars into relations. For example:
 * <pre>{@code
 * languages = load 'en,fr,jp' using LocationAsTuple(',');
 * -- languages is ('en', 'fr', 'jp')
 * language_bag = foreach languages generate flatten(TOBAG(*));
 * -- language_bag is a relation with three rows, ('en'), ('fr'), ('jp')
 * }</pre>
 */
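
To make the javadoc's example concrete, here is a minimal plain-Java sketch of the two steps it describes: splitting the "location" string into one tuple, then turning each field into its own row. It is not the Elephant-Bird source and does not use Pig's classes; the class and method names (LocationSplitSketch, toTuple, toRows) are made up for illustration, and treating the delimiter as a literal string rather than a regex is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Plain-Java illustration only; this is not the Elephant-Bird loader. */
class LocationSplitSketch {

    /** Splits the "location" string on a literal delimiter into one tuple's fields. */
    static List<String> toTuple(String location, String delimiter) {
        // Assumption: the delimiter is literal, so quote it to disable regex syntax.
        return List.of(location.split(Pattern.quote(delimiter), -1));
    }

    /** Mimics flatten(TOBAG(*)): each field becomes its own single-field row. */
    static List<List<String>> toRows(List<String> tuple) {
        List<List<String>> rows = new ArrayList<>();
        for (String field : tuple) {
            rows.add(List.of(field));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> languages = toTuple("en,fr,jp", ",");
        System.out.println(languages);          // [en, fr, jp]
        System.out.println(toRows(languages));  // [[en], [fr], [jp]]
    }
}
```

The first list corresponds to the single tuple ('en', 'fr', 'jp') and the second to the three-row relation in the javadoc.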




On Thu, Jan 24, 2013 at 1:03 PM, Dmitriy Ryaboy <dv...@gmail.com> wrote:
>
> I have a loader that does exactly that. Let me see about dropping it into Elephant-Bird.

Re: Hard-coded inline relations

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
I have a loader that does exactly that. Let me see about dropping it into Elephant-Bird.


On Thu, Jan 24, 2013 at 8:15 AM, Alan Gates <ga...@hortonworks.com> wrote:

> I agree this would be useful for debugging, but I'd go about it a
> different way.  Rather than add new syntax as you propose, it seems we
> could easily create an inline loader, so your script would look something
> like:
>
> A = load '{(Hello), (World)}' using InlineLoader();
> dump A;
>
> Alan.

Re: Hard-coded inline relations

Posted by Alan Gates <ga...@hortonworks.com>.
I agree this would be useful for debugging, but I'd go about it a different way.  Rather than add new syntax as you propose, it seems we could easily create an inline loader, so your script would look something like:

A = load '{(Hello), (World)}' using InlineLoader();
dump A;

Alan.
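
As a rough illustration of the parsing such a loader would need, here is a plain-Java sketch that turns a location string like '{(Hello), (World)}' into rows. InlineLoader is only a proposal in this thread, so everything here is hypothetical: the class name InlineBagParserSketch, the parse method, and the simplifications (flat tuples only, fields kept as trimmed strings, no quoting or escaping). A real loader would also have to implement Pig's load-function interface, which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parsing core for an inline loader; not an existing Pig class. */
class InlineBagParserSketch {

    // Matches one flat tuple such as "(Hello)" or "(a, b)"; nested tuples are not handled.
    private static final Pattern TUPLE = Pattern.compile("\\(([^()]*)\\)");

    /** Parses a bag literal like "{(Hello), (World)}" into rows of trimmed string fields. */
    static List<List<String>> parse(String bagLiteral) {
        String text = bagLiteral.trim();
        if (!text.startsWith("{") || !text.endsWith("}")) {
            throw new IllegalArgumentException("Expected a bag literal in braces: " + bagLiteral);
        }
        List<List<String>> rows = new ArrayList<>();
        Matcher m = TUPLE.matcher(text);
        while (m.find()) {
            List<String> fields = new ArrayList<>();
            // Split the tuple body on commas; -1 keeps trailing empty fields.
            for (String field : m.group(1).split(",", -1)) {
                fields.add(field.trim());
            }
            rows.add(fields);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(parse("{(Hello), (World)}")); // [[Hello], [World]]
    }
}
```

Each inner list stands for one tuple of the relation, so the example yields the two rows (Hello) and (World).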

On Jan 18, 2013, at 10:49 AM, Michael Malak wrote:

> I'm new to Pig, and it looks like there is no provision to declare relations inline in a Pig script (without LOADing from an external file)?
> 
> Based on
> http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#Constants
> I would have thought the following would constitute "Hello World" for Pig:
> 
> A = {('Hello'),('World')};
> DUMP A;
> 
> But I get a syntax error.  The ability to inline relations would be useful for debugging.  Is this limitation by design, or is it just not implemented yet?
>