Posted to user@pig.apache.org by Michael Malak <mi...@yahoo.com> on 2013/01/18 19:49:19 UTC
Hard-coded inline relations
I'm new to Pig, and it looks like there is no provision to declare relations inline in a Pig script (that is, without LOADing from an external file). Is that right?
Based on
http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#Constants
I would have thought the following would constitute "Hello World" for Pig:
A = {('Hello'),('World')};
DUMP A;
But I get a syntax error. The ability to inline relations would be useful for debugging. Is this limitation by design, or is it just not implemented yet?
Re: Hard-coded inline relations
Posted by Dmitriy Ryaboy <dv...@gmail.com>.
OK, you can build it from my Elephant-Bird (EB) branch:
https://github.com/dvryaboy/elephant-bird/tree/add_locationtuple_loader
You will want to build the elephant-bird-pig package, and the loader is
com.twitter.elephantbird.pig.load.LocationAsTuple
Here's the javadoc:
/**
* Parses the "location" into a tuple by splitting on a delimiter, and
* returns it.
* Handy for turning scalars into relations. For example:
* <pre>{@code
* languages = load 'en,fr,jp' using LocationAsTuple(',');
* -- languages is ('en', 'fr', 'jp')
* language_bag = foreach languages generate flatten(TOBAG(*));
* -- language_bag is a relation with three rows, ('en'), ('fr'), ('jp')
* }</pre>
*/
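For readers who want to see the behaviour the javadoc describes without building the branch, here is a minimal plain-Java sketch of the same two steps: split the "location" string into one tuple, then turn each field into its own row. This is not the Elephant-Bird code and it uses no Pig classes; the class and method names (LocationSplitSketch, locationAsTuple, flattenToRows) are made up for illustration, and keeping trailing empty fields is an assumption of this sketch, not something the javadoc states.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative only: models the behaviour described in the javadoc above
// with plain Java collections. NOT the Elephant-Bird loader; all names here
// are hypothetical.
final class LocationSplitSketch {

    // Split the "location" string on a literal delimiter into one tuple
    // (modelled here as a List of fields).
    static List<String> locationAsTuple(String location, String delimiter) {
        // Pattern.quote treats the delimiter literally, so "|" or "." are safe.
        // The -1 limit keeps trailing empty fields (an assumption of this sketch).
        return Arrays.asList(location.split(Pattern.quote(delimiter), -1));
    }

    // Model of flatten(TOBAG(*)): one single-field row per field of the tuple.
    static List<List<String>> flattenToRows(List<String> tuple) {
        List<List<String>> rows = new ArrayList<>();
        for (String field : tuple) {
            rows.add(List.of(field));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> languages = locationAsTuple("en,fr,jp", ",");
        System.out.println(languages);                 // [en, fr, jp]
        System.out.println(flattenToRows(languages));  // [[en], [fr], [jp]]
    }
}
```

The first method corresponds to the load step in the javadoc example, the second to the foreach/flatten step that turns one three-field tuple into three rows.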
On Thu, Jan 24, 2013 at 1:03 PM, Dmitriy Ryaboy <dv...@gmail.com> wrote:
>
> I have a loader that does exactly that. Let me see about dropping into Elephant-Bird.
Re: Hard-coded inline relations
Posted by Dmitriy Ryaboy <dv...@gmail.com>.
I have a loader that does exactly that. Let me see about dropping into
Elephant-Bird.
On Thu, Jan 24, 2013 at 8:15 AM, Alan Gates <ga...@hortonworks.com> wrote:
> I agree this would be useful for debugging, but I'd go about it a
> different way. Rather than add new syntax as you propose, it seems we
> could easily create an inline loader, so your script would look something
> like:
>
> A = load '{(Hello), (World)}' using InlineLoader();
> dump A;
>
> Alan.
Re: Hard-coded inline relations
Posted by Alan Gates <ga...@hortonworks.com>.
I agree this would be useful for debugging, but I'd go about it a different way. Rather than add new syntax as you propose, it seems we could easily create an inline loader, so your script would look something like:
A = load '{(Hello), (World)}' using InlineLoader();
dump A;
Alan.
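As a rough illustration of what the parsing half of such an inline loader might involve, here is a plain-Java sketch that turns a string like '{(Hello), (World)}' into rows of fields. InlineLoader does not exist in Pig as far as this thread shows, so everything below is hypothetical: the class name InlineBagSketch, the parse method, and the simplified grammar (flat tuples only, unquoted fields, no commas or parentheses inside a field). A real loader would also need to extend Pig's LoadFunc, which is left out here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the parsing step only. Not a Pig LoadFunc and not
// an existing Pig feature; the grammar is deliberately simplified.
final class InlineBagSketch {

    // Matches one flat tuple such as "(Hello)" or "(a, 1)"; nested
    // parentheses are not supported in this sketch.
    private static final Pattern TUPLE = Pattern.compile("\\(([^()]*)\\)");

    // Parse a bag literal like "{(Hello), (World)}" into a list of rows,
    // each row being a list of trimmed string fields.
    static List<List<String>> parse(String literal) {
        String text = literal.trim();
        if (!text.startsWith("{") || !text.endsWith("}")) {
            throw new IllegalArgumentException(
                "expected a bag literal like {(a), (b)}: " + literal);
        }
        List<List<String>> rows = new ArrayList<>();
        Matcher m = TUPLE.matcher(text);
        while (m.find()) {
            List<String> fields = new ArrayList<>();
            // Split the tuple body on commas and trim surrounding whitespace.
            for (String field : m.group(1).split(",", -1)) {
                fields.add(field.trim());
            }
            rows.add(fields);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(parse("{(Hello), (World)}")); // [[Hello], [World]]
    }
}
```

Every field comes back as a string; deciding types (chararray vs. int, and so on) would be a separate design question for an actual loader.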