Posted to dev@pig.apache.org by Prashant Kommireddi <pr...@gmail.com> on 2013/02/11 23:10:14 UTC

Injecting plans

Hey,

I wanted to run an idea by you guys. I have a use-case where I try
injecting load/store paths into the script. So if a user says A = load
'input'; I would like to add a base path to it and make it A = load
'base_path/foo/bar/input'.

I would like to achieve this programmatically, and one way I can think of
doing it is by allowing a setter on LOLoad/LOStore that can modify the
FileSpec. Is there a cleaner or better way to approach this?

In the future I would probably be meddling with more than just Load/Store
(e.g., disabling rmf or mkdir commands) but this is what I am looking at
currently.

Thanks,
Prashant

Re: Injecting plans

Posted by Bill Graham <bi...@gmail.com>.
>
> 3. Framework has to parse the script to determine the user provided
> locations?


No, no parsing needed. The setLocation method will get whatever location was
included in the STORE INTO 'foo' statement and it can then modify 'foo' to
be some entirely different thing. I'm still not clear on why this doesn't
meet your needs.
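
A minimal sketch of how a framework-supplied base path might reach the
function without the framework parsing the script: the framework records the
value as a property before submitting the job, and the function reads it back
when it is handed the location from the script. The key `framework.base.path`
and the class `InjectedBasePath` are hypothetical, and plain
`java.util.Properties` stands in here for whatever job configuration the
Load/StoreFunc actually sees.

```java
import java.util.Properties;

final class InjectedBasePath {
    // Hypothetical property key; it is not defined by Pig.
    static final String BASE_PATH_KEY = "framework.base.path";

    /** Framework side: record the per-job base path before submitting the script. */
    static Properties forJob(String basePath) {
        Properties props = new Properties();
        props.setProperty(BASE_PATH_KEY, basePath);
        return props;
    }

    /** Function side: rewrite the location the script supplied, e.g. 'foo'. */
    static String rewrite(Properties props, String scriptLocation) {
        String basePath = props.getProperty(BASE_PATH_KEY);
        if (basePath == null || basePath.isEmpty()) {
            return scriptLocation; // nothing injected: leave the location alone
        }
        return basePath + "/" + scriptLocation;
    }

    public static void main(String[] args) {
        Properties props = forJob("/data/job-42");
        System.out.println(rewrite(props, "foo")); // prints /data/job-42/foo
    }
}
```

The script keeps saying STORE INTO 'foo'; only the property changes from one
job to the next.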


On Tue, Feb 12, 2013 at 9:24 AM, Julien Le Dem <ju...@twitter.com> wrote:

> Then I would point in the same direction as Bill: override
> relativeToAbsolutePath.
> That's more or less what HCatalog does: you use a loader/storer that
> abstracts out where the data is, and based on the requested location you
> point to the right files.
> Out of curiosity: why does your framework need to parse first and then
> modify the plan? Are you trying to cache some intermediary results? Are
> you trying to change the logical or the physical plan?
> Julien


-- 
*Note that I'm no longer using my Yahoo! email address. Please email me at
billgraham@gmail.com going forward.*

Re: Injecting plans

Posted by Julien Le Dem <ju...@twitter.com>.
Then I would point in the same direction as Bill: override
relativeToAbsolutePath.
That's more or less what HCatalog does: you use a loader/storer that
abstracts out where the data is, and based on the requested location you
point to the right files.
Out of curiosity: why does your framework need to parse first and then
modify the plan? Are you trying to cache some intermediary results? Are
you trying to change the logical or the physical plan?
Julien
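
The indirection described above can be sketched as a lookup from the location
named in the script to the files that actually hold the data. This is only an
illustration of the idea, not HCatalog's API; `LocationRegistry` and the
sample entries are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

final class LocationRegistry {
    // Maps the location a script asks for to where the data really lives.
    private final Map<String, String> physicalByLogical = new HashMap<>();

    void register(String logicalName, String physicalPath) {
        physicalByLogical.put(logicalName, physicalPath);
    }

    /** Returns the physical path for a requested location, or the request itself if unknown. */
    String resolve(String requested) {
        return physicalByLogical.getOrDefault(requested, requested);
    }

    public static void main(String[] args) {
        LocationRegistry registry = new LocationRegistry();
        registry.register("input", "base_path/foo/bar/input");
        System.out.println(registry.resolve("input")); // prints base_path/foo/bar/input
        System.out.println(registry.resolve("other")); // prints other
    }
}
```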


On Tue, Feb 12, 2013 at 12:24 AM, Prashant Kommireddi
<pr...@gmail.com> wrote:

> Thanks Bill. I think the issue is that I am trying to inject a property
> (the base path) from an external framework into the Load/StoreFunc. As an
> analogy, suppose a fancy random generator that resides in the framework has
> to be used to set the location, but the "random generator logic" cannot be
> placed within a Load/StoreFunc. So a user who uses, say, CustomPigStorage
> passes their script to the framework, and the framework now needs to alter
> the i/o locations.
>
>
>    1. User submits script -> Framework.getService().submitPigScript(File
>    script);
>    2. Script is using a Load/StoreFunc
>    3. Framework has to parse the script to determine the user provided
>    locations?
>
> I understand when you say it could be done dynamically within Funcs, but I
> guess I am not very clear on how we could alter i/o locations from outside
> of Pig.
> Pardon me if I am not explaining it well enough!
>
> Julien, this is not for testing or for injecting data.

Re: Injecting plans

Posted by Prashant Kommireddi <pr...@gmail.com>.
Thanks Bill. I think the issue is that I am trying to inject a property
(the base path) from an external framework into the Load/StoreFunc. As an
analogy, suppose a fancy random generator that resides in the framework has
to be used to set the location, but the "random generator logic" cannot be
placed within a Load/StoreFunc. So a user who uses, say, CustomPigStorage
passes their script to the framework, and the framework now needs to alter
the i/o locations.


   1. User submits script -> Framework.getService().submitPigScript(File
   script);
   2. Script is using a Load/StoreFunc
   3. Framework has to parse the script to determine the user provided
   locations?

I understand when you say it could be done dynamically within Funcs, but I
guess I am not very clear on how we could alter i/o locations from outside
of Pig.
Pardon me if I am not explaining it well enough!

Julien, this is not for testing or for injecting data.


On Mon, Feb 11, 2013 at 7:47 PM, Bill Graham <bi...@gmail.com> wrote:

> You can do it dynamically in the StoreFunc/LoadFunc, but not before the LP
> is built (since the LP really isn't yet involved with the physical
> location). Why would you need to change the location that early in the
> process?

Re: Injecting plans

Posted by Bill Graham <bi...@gmail.com>.
You can do it dynamically in the StoreFunc/LoadFunc, but not before the
logical plan (LP) is built (since the LP really isn't yet involved with the
physical location). Why would you need to change the location that early in
the process?


On Mon, Feb 11, 2013 at 6:30 PM, Julien Le Dem <ju...@twitter.com> wrote:

> Is it for testing or something totally different?
> You can also look at o.a.p.builtin.mock.Storage for injecting data.
> Julien



Re: Injecting plans

Posted by Julien Le Dem <ju...@twitter.com>.
Is it for testing or something totally different?
You can also look at o.a.p.builtin.mock.Storage for injecting data.
Julien


On Mon, Feb 11, 2013 at 6:17 PM, Prashant Kommireddi <pr...@gmail.com> wrote:

> That would work in the case of a static path? The problem here is that my
> base path will vary entirely for each Pig job, and I would not like the
> client to set it. Rather, my framework plugs in the base path, and I am
> guessing that can happen only after parsing/buildLP?

Re: Injecting plans

Posted by Prashant Kommireddi <pr...@gmail.com>.
That would work in the case of a static path? The problem here is that my
base path will vary entirely for each Pig job, and I would not like the
client to set it. Rather, my framework plugs in the base path, and I am
guessing that can happen only after parsing/buildLP?

On Mon, Feb 11, 2013 at 5:55 PM, Bill Graham <bi...@gmail.com> wrote:

> We've done this before by overriding relativeToAbsolutePath
> and setLocation in the LoadFunc, or the corresponding methods in StoreFunc.

Re: Injecting plans

Posted by Bill Graham <bi...@gmail.com>.
We've done this before by overriding relativeToAbsolutePath
and setLocation in the LoadFunc, or the corresponding methods in StoreFunc.
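
A minimal sketch of the path rewriting such an override might perform, written
as plain Java so it runs without Pig on the classpath. `BasePathResolver` and
its `resolve` method are hypothetical names, not part of Pig; the assumption
is that an overridden relativeToAbsolutePath in a custom loader could delegate
to a helper like this. Comma-separated location lists and globs are not
handled.

```java
final class BasePathResolver {
    /**
     * Prefixes a relative location with basePath. Absolute paths and
     * locations carrying a URI scheme (e.g. hdfs://) are returned unchanged.
     */
    static String resolve(String basePath, String location) {
        if (location.startsWith("/") || location.contains("://")) {
            return location;
        }
        // Drop a trailing slash on the base so the join never doubles it.
        String base = basePath.endsWith("/")
                ? basePath.substring(0, basePath.length() - 1)
                : basePath;
        return base + "/" + location;
    }

    public static void main(String[] args) {
        // load 'input' becomes load 'base_path/foo/bar/input'
        System.out.println(resolve("base_path/foo/bar", "input"));
    }
}
```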

On Mon, Feb 11, 2013 at 2:10 PM, Prashant Kommireddi <pr...@gmail.com> wrote:

> Hey,
>
> I wanted to run an idea by you guys. I have a use-case where I try
> injecting load/store paths into the script. So if a user says A = load
> 'input'; I would like to add a base path to it and make it A = load
> 'base_path/foo/bar/input'.
>
> I would like to achieve this programmatically, and one way I can think of
> doing it is by allowing a setter on LOLoad/LOStore that can modify the
> FileSpec. Is there a cleaner or better way to approach this?
>
> In the future I would probably be meddling with more than just Load/Store
> (e.g., disabling rmf or mkdir commands) but this is what I am looking at
> currently.
>
> Thanks,
> Prashant
>


