Posted to dev@pig.apache.org by Russell Jurney <ru...@gmail.com> on 2012/03/02 01:19:49 UTC

Turn off Speculative Execution in a UDF?

Can you turn off speculative execution in a StoreFunc?  I believe it is
leading to duplicates in MongoStorage();

-- 
Russell Jurney twitter.com/rjurney russell.jurney@gmail.com datasyndrome.com

Re: Turn off Speculative Execution in a UDF?

Posted by Alan Gates <ga...@hortonworks.com>.
Store functions can run in either map or reduce depending on your script.  If your script has any operator that requires a reduce (most joins, group by, order by, distinct, limit) then the store function will be in a reduce.

Alan.
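
A minimal sketch of the point above, assuming a hypothetical script: the paths, relation names and schema are placeholders, and PigStorage stands in for whatever store function is actually in use. Because the GROUP forces a reduce phase, the STORE here would run in reduce tasks; setting both properties also covers scripts that turn out to be map-only.

```pig
-- Hypothetical example; 'input', 'output' and the schema are placeholders.
-- Disable speculative execution for both phases, since the STORE
-- may end up in either a map or a reduce depending on the script.
SET mapred.map.tasks.speculative.execution false
SET mapred.reduce.tasks.speculative.execution false

raw = LOAD 'input' AS (id:chararray, value:int);
-- GROUP requires a reduce, so the STORE below runs in a reduce task.
grouped = GROUP raw BY id;
totals = FOREACH grouped GENERATE group AS id, SUM(raw.value) AS total;
STORE totals INTO 'output' USING PigStorage();
```

Dropping the GROUP and FOREACH lines and storing raw directly would make this a map-only job, in which case only the map-side property would matter.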

On Jan 11, 2013, at 9:14 AM, Corbin Hoenes wrote:

> Hi all,
> 
> I am a little unclear about which speculative execution you must disable.
> In which phase do storage functions run, map or reduce?
> 
> I've always just done both.
> set mapred.reduce.tasks.speculative.execution false
> set mapred.map.tasks.speculative.execution false
> 
> Thanks for any hints!
> 


Re: Turn off Speculative Execution in a UDF?

Posted by Corbin Hoenes <co...@tynt.com>.
Hi all,

I am a little unclear about which speculative execution you must disable.
In which phase do storage functions run, map or reduce?

I've always just done both.
set mapred.reduce.tasks.speculative.execution false
set mapred.map.tasks.speculative.execution false

Thanks for any hints!

On Fri, Mar 2, 2012 at 4:18 PM, Bill Graham <bi...@gmail.com> wrote:

> I tried to test turning this off in the setStoreLocation method, but even
> without that change I wasn't able to get a job to run with speculative
> execution (SE) happening. As a result I can't verify that the setting is
> doing anything. Russell, if you can reproduce SE I'd be curious to hear if
> you could turn it off in setStoreLocation.
>

Re: Turn off Speculative Execution in a UDF?

Posted by Bill Graham <bi...@gmail.com>.
I tried to test turning this off in the setStoreLocation method, but even
without that change I wasn't able to get a job to run with speculative
execution (SE) happening. As a result I can't verify that the setting is
doing anything. Russell, if you can reproduce SE I'd be curious to hear if
you could turn it off in setStoreLocation.

On Fri, Mar 2, 2012 at 2:40 PM, Russell Jurney <ru...@gmail.com> wrote:

> I thought it was too late in the workflow to do this, but it would be
> really cool if you could.  I don't tend to think about MapReduce much when
> I write Pig, except to group my scripts by jobs... so this was a surprise
> for me. Made sense once I thought of it.  But it was a surprise.
>
>



-- 
*Note that I'm no longer using my Yahoo! email address. Please email me at
billgraham@gmail.com going forward.*

Re: Turn off Speculative Execution in a UDF?

Posted by Russell Jurney <ru...@gmail.com>.
I thought it was too late in the workflow to do this, but it would be
really cool if you could.  I don't tend to think about MapReduce much when
I write Pig, except to group my scripts by jobs... so this was a surprise
for me. Made sense once I thought of it.  But it was a surprise.

On Fri, Mar 2, 2012 at 1:49 PM, Bill Graham <bi...@gmail.com> wrote:

> I was also curious about this and will try it, but my initial thought was
> that at that point it might be too late in the workflow of the job. I'll
> give it a shot and report back.
>
>




Re: Turn off Speculative Execution in a UDF?

Posted by Bill Graham <bi...@gmail.com>.
I was also curious about this and will try it, but my initial thought was
that at that point it might be too late in the workflow of the job. I'll
give it a shot and report back.


On Fri, Mar 2, 2012 at 1:45 PM, Dmitriy Ryaboy <dv...@gmail.com> wrote:

> In a StoreFunc, you could do that when you get passed the jobconf, right?
>




Re: Turn off Speculative Execution in a UDF?

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
In a StoreFunc, you could do that when you get passed the jobconf, right?

On Thu, Mar 1, 2012 at 9:37 PM, Bill Graham <bi...@gmail.com> wrote:
> I don't think so. We just do it in the pig script before using the store
> func:
>
> SET mapred.map.tasks.speculative.execution false
>
>

Re: Turn off Speculative Execution in a UDF?

Posted by Bill Graham <bi...@gmail.com>.
I don't think so. We just do it in the pig script before using the store
func:

SET mapred.map.tasks.speculative.execution false


On Thu, Mar 1, 2012 at 4:19 PM, Russell Jurney <ru...@gmail.com> wrote:

> Can you turn off speculative execution in a StoreFunc?  I believe it is
> leading to duplicates in MongoStorage();
>
> --
> Russell Jurney twitter.com/rjurney russell.jurney@gmail.com
> datasyndrome.com
>


