Posted to dev@pig.apache.org by Olga Natkovich <ol...@yahoo-inc.com> on 2011/02/03 19:42:28 UTC

REMINDER: Pig developer meeting in February

Hi guys,

This is just a reminder that the meeting will be held next Wednesday, 2/9, 4-6 pm at the Yahoo campus.

If you have not yet responded but are planning to attend, please let me know.

Olga

-----Original Message-----
From: Santhosh Srinivasan [mailto:sms@yahoo-inc.com] 
Sent: Friday, January 28, 2011 3:36 PM
To: dev@pig.apache.org
Subject: RE: Pig developer meeting in February

I am planning to attend. 

-----Original Message-----
From: Olga Natkovich [mailto:olgan@yahoo-inc.com] 
Sent: Friday, January 28, 2011 12:58 PM
To: dev@pig.apache.org
Subject: RE: Pig developer meeting in February

I believe we have critical mass so the meeting is on!

If you have not responded yet but are planning to attend, please let me know.

Thanks,

Olga

-----Original Message-----
From: Julien Le Dem [mailto:ledemj@yahoo-inc.com]
Sent: Thursday, January 27, 2011 5:21 PM
To: dev@pig.apache.org
Subject: Re: Pig developer meeting in February

Me too.
Julien


On 1/27/11 4:09 PM, "Dmitriy Ryaboy" <dv...@gmail.com> wrote:

Ok yeah I'll come :).



On Thu, Jan 27, 2011 at 3:17 PM, Olga Natkovich <ol...@yahoo-inc.com> wrote:

> While there is a lively discussion on this thread, I have not actually
> gotten any responses about having the meeting, with the exception of 1 person :).
>
> Please, let me know by the end of the week if you are planning to attend.
> If we don't get at least a few more responses I suggest we postpone 
> the meeting.
>
> Thanks,
>
> Olga
>
> -----Original Message-----
> From: Dmitriy Ryaboy [mailto:dvryaboy@gmail.com]
> Sent: Wednesday, January 26, 2011 6:04 PM
> To: dev@pig.apache.org
> Subject: Re: Pig developer meeting in February
>
> Right, we do partition filtering, but not true predicate pushdown.
>
> On Wed, Jan 26, 2011 at 5:59 PM, Daniel Dai <ji...@yahoo-inc.com>
> wrote:
>
> > Are you talking about LoadMetadata.setPartitionFilter?
> > PartitionFilterOptimizer will do that.
> >
> > Daniel
> >
> >
> > Dmitriy Ryaboy wrote:
> >
> >> I may be wrong but I think predicate pushdown is designed for, but
> >> not actually implemented in, the current LoadPushDown interface (you
> >> can only push projections). If I am wrong, that's great, but if
> >> not, that would be an important feature to add, as people are trying
> >> to connect Pig to "smart" storage systems like RDBMSes, HBase, and
> >> Cassandra more and more. I think we only kind of simulate this with
> >> partition key info, which is not always sufficient.
> >>
> >> D
> >>
> >> On Wed, Jan 26, 2011 at 2:41 PM, Julien Le Dem 
> >> <le...@yahoo-inc.com>
> >> wrote:
> >>
> >>
> >>
> >>> If making Pig thread safe (i.e., two threads each running a different
> >>> Pig script) is important, then we need to change some of the APIs from
> >>> static singleton access to a dependency injection pattern.
> >>> In that case, this should probably be done before 1.0. For example,
> >>> UDFContext should be passed to the UDF after construction (similar
> >>> to the ServletContext in Servlet or the way Hadoop passes the
> >>> context to tasks). Also, a clearly separated API that does not
> >>> depend on the Pig implementation would help.
> >>> For example, UDFContext is in org.apache.pig.impl.util when it
> >>> would be better in org.apache.pig.api (or at least an interface
> >>> defining it should be).
> >>>
> >>> Julien
> >>>
> >>> On 1/24/11 10:14 AM, "Olga Natkovich" <ol...@yahoo-inc.com> wrote:
> >>>
> >>> Hi Guys,
> >>>
> >>> I think it is time for us to have another meeting. Yahoo would be
> >>> happy to host if this works for everybody. How about Wednesday,
> >>> 2/9, 4-6 pm? Please let us know if you are planning to attend and
> >>> if the date/time works for you.
> >>>
> >>> Things that come to mind to discuss (as always, feel free to
> >>> suggest others):
> >>>
> >>> - Error handling proposal - this might be easier to finalize
> >>>   face-to-face
> >>> - Pig 0.9 plan
> >>> - Pig Roadmap beyond 0.9
> >>>   o What do we want to do in Pig.next?
> >>>   o Are we ready for Pig 1.0?
> >>>
> >>> Olga
> >>>
> >>>
> >>>
> >>>
> >>
> >
>

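The thread-safety suggestion quoted above (hand the context to the UDF after construction instead of having the UDF reach for a static singleton) can be sketched roughly as follows. This is a language-neutral illustration written in Python, not Pig's actual Java API: UdfContext, UpperUdf, set_context, run_script, and run_two_scripts are hypothetical names, and the only assumption is that each running script owns its own context object.

```python
import threading


class UdfContext:
    """Hypothetical per-script context (stand-in for a UDFContext-like object)."""

    def __init__(self, script_name, properties):
        self.script_name = script_name
        self.properties = dict(properties)


class UpperUdf:
    """Hypothetical UDF that receives its context after construction."""

    def __init__(self):
        self._context = None

    def set_context(self, context):
        # Injected by the runtime; the UDF never calls a static getter.
        self._context = context

    def evaluate(self, value):
        prefix = self._context.properties.get("prefix", "")
        return prefix + value.upper()


def run_script(script_name, properties, rows, results):
    # Each script builds its own context and its own UDF instance,
    # so nothing is shared between concurrently running scripts.
    context = UdfContext(script_name, properties)
    udf = UpperUdf()
    udf.set_context(context)
    results[script_name] = [udf.evaluate(row) for row in rows]


def run_two_scripts():
    results = {}
    threads = [
        threading.Thread(
            target=run_script,
            args=("a.pig", {"prefix": "A:"}, ["x", "y"], results),
        ),
        threading.Thread(
            target=run_script,
            args=("b.pig", {"prefix": "B:"}, ["x", "y"], results),
        ),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Because every thread works with the context it was given, the two scripts cannot overwrite each other's properties, which is the property a static singleton cannot guarantee.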

Re: REMINDER: Pig developer meeting in February

Posted by Ashutosh Chauhan <ha...@apache.org>.
I'll be there.

Ashutosh

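The distinction drawn earlier in this thread between partition filtering and true predicate pushdown can be illustrated with a small sketch. It is written in Python with hypothetical names (PartitionedStore, scan, load_with_partition_filter_only, load_with_predicate_pushdown) and does not use Pig's LoadMetadata or LoadPushDown interfaces; it only assumes a store whose rows are grouped by a partition key.

```python
class PartitionedStore:
    """Hypothetical storage system: rows grouped by a partition key."""

    def __init__(self, partitions):
        self.partitions = partitions  # {partition_key: [row dicts]}

    def scan(self, partition_filter=None, row_predicate=None):
        shipped = []
        for key, rows in self.partitions.items():
            # Partition filtering: whole partitions are skipped by key alone.
            if partition_filter is not None and not partition_filter(key):
                continue
            for row in rows:
                # Predicate pushdown: the store drops non-matching rows
                # itself, before anything is shipped to the engine.
                if row_predicate is None or row_predicate(row):
                    shipped.append(row)
        return shipped


def load_with_partition_filter_only(store, date, min_clicks):
    # The store prunes partitions; the engine still filters every row.
    rows = store.scan(partition_filter=lambda key: key == date)
    result = [row for row in rows if row["clicks"] >= min_clicks]
    return result, len(rows)


def load_with_predicate_pushdown(store, date, min_clicks):
    # The store prunes partitions and evaluates the row predicate.
    rows = store.scan(
        partition_filter=lambda key: key == date,
        row_predicate=lambda row: row["clicks"] >= min_clicks,
    )
    return rows, len(rows)
```

Both functions return the same result rows together with the number of rows the store had to ship. With a storage system that can evaluate predicates itself, the second number is smaller, which is why the feature matters for systems such as HBase or an RDBMS.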
Re: REMINDER: Pig developer meeting in February

Posted by Benjamin Reed <br...@yahoo-inc.com>.
i'll be there.

ben



Re: REMINDER: Pig developer meeting in February

Posted by Renato Marroquín Mogrovejo <re...@gmail.com>.
Hey, there are slides from Chris Olston's talk.

http://infolab.stanford.edu/infoseminar/olston.txt
http://infolab.stanford.edu/infoseminar/olston-slides.pdf

But more formal documentation about Penny/InspectorGadget (cool name btw)
would be awesome.


Renato M.


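The error-handling options recorded in the meeting notes in this thread (a session-wide handler, per-statement overrides, and a threshold number of errors tolerated before exiting) can be combined in one rough sketch. It is written in Python and every name in it (ErrorPolicy, on_error, handle, apply_statement, TooManyErrors) is hypothetical; ONERROR was only a proposal at this point, so this is an illustration of the idea, not of any Pig syntax or API.

```python
class TooManyErrors(Exception):
    """Raised once the number of bad records exceeds the allowed threshold."""


class ErrorPolicy:
    """Hypothetical policy: a session-wide handler, optional per-statement
    overrides, and a cap on how many record errors are tolerated."""

    def __init__(self, default_handler, max_errors):
        self.default_handler = default_handler
        self.max_errors = max_errors
        self.overrides = {}
        self.error_count = 0

    def on_error(self, statement, handler):
        # A statement-level handler takes precedence over the session default.
        self.overrides[statement] = handler

    def handle(self, statement, record, error):
        self.error_count += 1
        if self.error_count > self.max_errors:
            raise TooManyErrors(
                f"{self.error_count} errors, limit is {self.max_errors}"
            )
        handler = self.overrides.get(statement, self.default_handler)
        handler(statement, record, error)


def apply_statement(policy, statement, func, records):
    output = []
    for record in records:
        try:
            output.append(func(record))
        except (ValueError, TypeError) as error:
            # Data error: hand the record to the handler and keep going.
            policy.handle(statement, record, error)
    return output
```

Only data errors raised while processing a record are routed to the handlers; control flow is untouched until the threshold is exceeded, which matches the key idea in the notes that error handling should focus on data errors rather than control flow.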
RE: REMINDER: Pig developer meeting in February

Posted by Olga Natkovich <ol...@yahoo-inc.com>.
We do not have anything public about Penny yet - still trying to figure out when/if it is going out. I don't think there is a whole lot of interaction with the error handling proposal, but I will let Alan comment on that.

Given that the error handling proposal is still not finalized and 0.9 already has lots of changes and little time left, I would suggest delaying it to the release after 0.9. 

Olga

-----Original Message-----
From: Dmitriy Ryaboy [mailto:dvryaboy@gmail.com] 
Sent: Monday, February 14, 2011 3:49 PM
To: dev@pig.apache.org
Subject: Re: REMINDER: Pig developer meeting in February

Thanks for that, arvind.

Y! folks, is there any public documentation for Penny?
Is there overlap there with the error handling proposal?

Also: do you think error handling can make it into 0.9, or are we thinking 0.10?

D

On Mon, Feb 14, 2011 at 12:55 PM, arvind@cloudera.com
<ar...@cloudera.com> wrote:

> Hi,
>
> Sorry for the delay in sending this. Following are the notes from the last
> developer's meeting.
>
> Arvind
> -----------
> *Attendees*
>
>   - From Y!: Alan, Santhosh, Romain, Daniel, Richard, Ashutosh, Ben, Julien
>   - From Cloudera: Arvind
>
> *Agenda*
>
>   - Error Handling
>   - Brainstorming Ideas For 0.9
>   - Brainstorming Ideas Beyond 0.9
>
> *Error Handling Suggestions/Proposal Discussion:*
>
>   - Allow each statement to declare an ONERROR clause with a UDF that
>   takes control in case of an error.
>      - This would be better than the current behavior of exiting on error.
>   - Alternatively, allow ONERROR to be declared for an entire
>   script/session which would allow individual statements to override and
>   provide a more specialized UDF for error handling.
>   - Yet another alternative - allow the specification of a threshold number
>   of errors that Pig ignores before exiting.
>   - Key idea is to ensure that the error handling is focused on data error
>   handling and not control-flow.
>   - Action Item: Post the key proposal on the Wiki.
>
> *Brainstorming Ideas For 0.9:*
>
>   - Internal development done by March
>   - Release tentatively by May
>   - Support for ILLUSTRATE.
>   - Current status:
>      - Parser rewrite almost complete
>      - Working on load data according to schema - support for padding
>      missing values
>      - No support for Boolean type planned yet.
>   - Big features in 0.9
>      - Parser change
>      - Macro support
>      - Jython/Script support
>      - Penny (formerly Inspector Gadget): framework to instrument scripts.
>      Allows detecting bad records that cause failures and implementing
>      constraints.
>         - Works by integrating with the optimizer to produce wrappers for
>         key UDFs of interest.
>         - Agents can be added in different parts of the query
>         - Prepackaged agents available, but framework allows the creation
>         of custom agents as needed.
>         - Pending work - implementation of unit tests, and turning this
>         into a patch.
>
> *Brainstorming Ideas Beyond 0.9:*
>
>   - Support for different backends for Pig (MR, Piranha, Local, Oozie)
>      - Execution engine that can generate plans specific to the underlying
>      architecture and allow controlling routines to rewrite/re-optimize
>      the plan mid-execution.
>   - Thread safety when running local jobs - to allow better embedding of
>   Pig as a light-weight tool in web-applications and other multi-threaded
>   environments.
>      - Work includes making UDF context thread-safe and removing statics
>      from the implementation.
>      - Will benefit Oozie and other systems that embed Pig without having
>      to worry about side-effects.
>   - Allow execution to resume from where it left off after a runtime
>   failure.
>      - May be done by allowing Oozie as a backend where the plan is
>      converted into an Oozie workflow.
>      - Alternatively Pig could delegate blocks of execution to Oozie.
>   - Scalability: Pig should support users who may not know the intricate
>   details of the job/architecture. Things such as memory allocation and
>   skew handling should happen automatically, without user involvement.
>   - Allow pig to kill jobs already submitted if the shell exits due to a
>   Control+C or other failures.
>   - UDF 2.0 - simplify UDF interfaces, along with support for multiple
>   versions of the UDF at the same time.
>
>
> *General*
>
>   - Loops in Pig: No direct support, but available indirectly by
>   integration with scripting environments.
>   - Would be good to allow Pig to be provisioned across the cluster for
>   faster job startup.
>   - Pig-pen: not under active development and not supported.
>
>
> On Fri, Feb 11, 2011 at 6:30 PM, Santhosh Srinivasan <sms@yahoo-inc.com>
> wrote:
>
> > Arvind from Cloudera took excellent notes. You should see them next week
> > after Alan gets a chance to review them.
> >
> > Santhosh
> >
> > -----Original Message-----
> > From: Dmitriy Ryaboy [mailto:dvryaboy@gmail.com]
> > Sent: Friday, February 11, 2011 5:34 PM
> > To: dev@pig.apache.org
> > Subject: Re: REMINDER: Pig developer meeting in February
> >
> > Hi folks,
> > Any chance someone took notes? :)
> >
> > D
> >
> > On Tue, Feb 8, 2011 at 9:38 PM, Dmitriy Ryaboy <dv...@gmail.com> wrote:
> > > Hi All,
> > > I got sick and won't be able to make it. Would love to see some notes
> > > after the meeting :).
> > >
> > > D
> > >
> > > On Tue, Feb 8, 2011 at 10:29 AM, Olga Natkovich <ol...@yahoo-inc.com> wrote:
> > >> Hi Guys,
> > >>
> > >> We are looking forward to seeing you tomorrow at 4 pm at the Yahoo campus
> > >> in Sunnyvale.
> > >>
> > >> Yahoo address is
> > >>
> > >> 701 First Ave.
> > >> Sunnyvale, CA 94089
> > >>
> > >> We are in building E. Please ask for Alan or me at the reception.
> > >>
> > >> Olga
> > >>
> > >> -----Original Message-----
> > >> From: Olga Natkovich [mailto:olgan@yahoo-inc.com]
> > >> Sent: Thursday, February 03, 2011 10:42 AM
> > >> To: dev@pig.apache.org
> > >> Subject: REMINDER: Pig developer meeting in February
> > >>
> > >> Hi guys,
> > >>
> > >> This is just a reminder that the meeting will be held next Wednesday,
> > 2/9 4-6 pm at Yahoo campus.
> > >>
> > >> If you have not yet responded but planning to attend, please, let me
> > know.
> > >>
> > >> Olga
> > >>
> > >> -----Original Message-----
> > >> From: Santhosh Srinivasan [mailto:sms@yahoo-inc.com]
> > >> Sent: Friday, January 28, 2011 3:36 PM
> > >> To: dev@pig.apache.org
> > >> Subject: RE: Pig developer meeting in February
> > >>
> > >> I am planning to attend.
> > >>
> > >> -----Original Message-----
> > >> From: Olga Natkovich [mailto:olgan@yahoo-inc.com]
> > >> Sent: Friday, January 28, 2011 12:58 PM
> > >> To: dev@pig.apache.org
> > >> Subject: RE: Pig developer meeting in February
> > >>
> > >> I believe we have critical mass so the meeting is on!
> > >>
> > >> If you have not responded yet but are planning to attend, please let
> > >> me know.
> > >>
> > >> Thanks,
> > >>
> > >> Olga
> > >>
> > >> -----Original Message-----
> > >> From: Julien Le Dem [mailto:ledemj@yahoo-inc.com]
> > >> Sent: Thursday, January 27, 2011 5:21 PM
> > >> To: dev@pig.apache.org
> > >> Subject: Re: Pig developer meeting in February
> > >>
> > >> Me too.
> > >> Julien
> > >>
> > >>
> > >> On 1/27/11 4:09 PM, "Dmitriy Ryaboy" <dv...@gmail.com> wrote:
> > >>
> > >> Ok yeah I'll come :).
> > >>
> > >>
> > >>
> > >> On Thu, Jan 27, 2011 at 3:17 PM, Olga Natkovich <ol...@yahoo-inc.com> wrote:
> > >>
> > >>> While there is a lively discussion on this thread, I have not
> > >>> actually gotten any responses to having the meeting, with the
> > >>> exception of 1 person :).
> > >>>
> > >>> Please let me know by the end of the week if you are planning to
> > >>> attend. If we don't get at least a few more responses I suggest we
> > >>> postpone the meeting.
> > >>>
> > >>> Thanks,
> > >>>
> > >>> Olga
> > >>>
> > >>> -----Original Message-----
> > >>> From: Dmitriy Ryaboy [mailto:dvryaboy@gmail.com]
> > >>> Sent: Wednesday, January 26, 2011 6:04 PM
> > >>> To: dev@pig.apache.org
> > >>> Subject: Re: Pig developer meeting in February
> > >>>
> > >>> Right, we do partition filtering, but not true predicate pushdown.
> > >>>
> > >>> On Wed, Jan 26, 2011 at 5:59 PM, Daniel Dai <ji...@yahoo-inc.com>
> > >>> wrote:
> > >>>
> > >>> > Are you talking about LoadMetadata.setPartitionFilter?
> > >>> > PartitionFilterOptimizer will do that.
> > >>> >
> > >>> > Daniel
> > >>> >
> > >>> >
> > >>> > Dmitriy Ryaboy wrote:
> > >>> >
> > >>> >> I may be wrong but I think predicate pushdown is designed for,
> > >>> >> but not actually implemented in, the current LoadPushdown
> > >>> >> interface (you can only push projections). If I am wrong, that's
> > >>> >> great, but if not, that would be an important feature to add, as
> > >>> >> people are trying to connect Pig to "smart" storage systems like
> > >>> >> RDBMSes, HBase, and Cassandra more and more. I think we only kind
> > >>> >> of simulate this with partition key info, which is not always
> > >>> >> sufficient.
> > >>> >>
> > >>> >> D
> > >>> >>
> > >>> >> On Wed, Jan 26, 2011 at 2:41 PM, Julien Le Dem
> > >>> >> <le...@yahoo-inc.com>
> > >>> >> wrote:
> > >>> >>
> > >>> >>
> > >>> >>
> > >>> >>> If making Pig thread safe (i.e., two threads each running a
> > >>> >>> different Pig script) is important, then we need to change some
> > >>> >>> of the APIs from static singleton access to a dependency
> > >>> >>> injection pattern. In that case, this should probably be done
> > >>> >>> before 1.0.
> > >>> >>> For example: UDFContext should be passed to the UDF after
> > >>> >>> construction (similar to the ServletContext in Servlet or the
> > >>> >>> way Hadoop passes the context to tasks).
> > >>> >>> Also, a clearly separated API that does not depend on the Pig
> > >>> >>> implementation would help. For example, UDFContext is in
> > >>> >>> org.apache.pig.impl.util when it would be better in
> > >>> >>> org.apache.pig.api (or at least an interface defining it).
> > >>> >>>
> > >>> >>> Julien
> > >>> >>>
> > >>> >>> On 1/24/11 10:14 AM, "Olga Natkovich" <ol...@yahoo-inc.com> wrote:
> > >>> >>>
> > >>> >>> Hi Guys,
> > >>> >>>
> > >>> >>> I think it is time for us to have another meeting. Yahoo would
> > >>> >>> be happy to host if this works for everybody. How about
> > >>> >>> Wednesday, 2/9, 4-6 pm? Please let us know if you are planning
> > >>> >>> to attend and if the date/time works for you.
> > >>> >>>
> > >>> >>> Things that come to mind to discuss (as always, feel free to
> > >>> >>> suggest others):
> > >>> >>>
> > >>> >>> - Error handling proposal - this might be easier to finalize
> > >>> >>>   face-to-face
> > >>> >>> - Pig 0.9 plan
> > >>> >>> - Pig Roadmap beyond 0.9
> > >>> >>>   o What do we want to do in Pig.next?
> > >>> >>>   o Are we ready for Pig 1.0?
> > >>> >>>
> > >>> >>> Olga
> > >>> >>>
> > >>> >>>
> > >>> >>>
> > >>> >>>
> > >>> >>
> > >>> >
> > >>>
> > >>
> > >>
> > >
> >
>

Re: REMINDER: Pig developer meeting in February

Posted by Ashutosh Chauhan <ha...@apache.org>.
There is related work that overlaps with this, though with (slightly)
different goals and implementations:

http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper37.pdf
http://www.cidrdb.org/cidr2011/Talks/CIDR11_Ikeda.ppt

Ashutosh

On Mon, Feb 14, 2011 at 15:48, Dmitriy Ryaboy <dv...@gmail.com> wrote:
> Thanks for that, arvind.
>
> Y! folks, is there any public documentation for Penny?
> Is there overlap there with the error handling proposal?
>
> Also: think error handling can make it into 0.9 or are we thinking 0.10?
>
> D

Re: REMINDER: Pig developer meeting in February

Posted by Alan Gates <ga...@yahoo-inc.com>.
On Feb 15, 2011, at 5:18 AM, Dmitriy Ryaboy wrote:

> Is there overlap there with the error handling proposal?

I don't think so.  The error handling proposal is about how to handle  
errors that happen when you are running Pig jobs.  Penny is a way to  
instrument your scripts so that you can do things like crash analysis,  
etc.  In that way it's similar to the interface that Java makes
available to tools like gcov for doing bytecode insertion.
Instrumenting code you're using in a production environment would be  
prohibitively expensive.

>
> Also: think error handling can make it into 0.9 or are we thinking  
> 0.10?

Given that no one is actively working on it and we're hoping to end  
feature development on 0.9 in the next month, I don't see how there  
will be time.

Alan.


Re: REMINDER: Pig developer meeting in February

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Thanks for that, Arvind.

Y! folks, is there any public documentation for Penny?
Is there overlap there with the error handling proposal?

Also: think error handling can make it into 0.9 or are we thinking 0.10?

D

Re: REMINDER: Pig developer meeting in February

Posted by Milind Bhandarkar <mb...@linkedin.com>.
On Feb 14, 2011, at 12:55 PM, arvind@cloudera.com wrote:
> 
>   - Support for different backends for Pig (MR, Piranha, Local, Oozie)
>      - Execution engine that can generate plans specific to the underlying
>      architecture and allow controlling routines to rewrite/re-optimize the
>      plan mid-execution.

+1 for Oozie backend.

>   - Allow execution to resume from where it left off after due to runtime
>   failure.
>      - May be done by allowing Oozie as a backend where the plan is
>      converted into an Oozie workflow.
>      - Alternatively Pig could delegate blocks of execution to Oozie.

See above.

This would be very beneficial for Oozie (even more than for Pig), because no one should be made to program in XML! The current workflow specification is a cruelty!

Pig already has constructs to invoke HDFS commands, arbitrary jars, and MapReduce code. If it had a Hive execution mode, e.g.

A = hive("select xyz from etc...");

all of these could be handed off to Oozie (I believe Alejandro has a Hive action for Oozie already). Then there would be no "Hive vs Pig"; instead, pigs will really eat anything.

- milind

---
Milind Bhandarkar
mbhandarkar@linkedin.com




Re: REMINDER: Pig developer meeting in February

Posted by "arvind@cloudera.com" <ar...@cloudera.com>.
Hi,

Sorry for the delay in sending this. Following are the notes from the last
developer's meeting.

Arvind
-----------
*Attendees*

   - From Y!: Alan, Santhosh, Romain, Daniel, Richard, Ashutosh, Ben, Julien
   - From Cloudera: Arvind

*Agenda*

   - Error Handling
   - Brainstorming Ideas For 0.9
   - Brainstorming Ideas Beyond 0.9

*Error Handling Suggestions/Proposal Discussion:*

   - Allow each statement to declare an ONERROR clause with a UDF that takes
   control when an error occurs.
      - This would be better than the current behavior of exiting on error.
   - Alternatively, allow ONERROR to be declared for an entire
   script/session which would allow individual statements to override and
   provide a more specialized UDF for error handling.
   - Yet another alternative - allow the specification of a threshold number
   of errors that Pig ignores before exiting.
   - Key idea is to ensure that the error handling is focused on data error
   handling and not control-flow.
   - Action Item: Post the key proposal on the Wiki.
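
To make the alternatives above concrete, here is a minimal Python sketch (not Pig code, and not the proposal itself) of a data-error handler combined with an error threshold. The names apply_with_error_handling, on_error, max_errors and TooManyErrors are all hypothetical, chosen only for illustration; the on_error callback stands in for an ONERROR UDF.

```python
from typing import Any, Callable, Iterable, List, Optional


class TooManyErrors(RuntimeError):
    """Raised once the number of bad records exceeds the allowed threshold."""


def apply_with_error_handling(
    records: Iterable[Any],
    udf: Callable[[Any], Any],
    on_error: Optional[Callable[[Any, Exception], None]] = None,
    max_errors: int = 0,
) -> List[Any]:
    """Apply udf to each record, tolerating up to max_errors failures."""
    results = []
    errors = 0
    for record in records:
        try:
            results.append(udf(record))
        except Exception as exc:
            errors += 1
            if on_error is not None:
                # Hypothetical stand-in for a user-supplied ONERROR UDF.
                on_error(record, exc)
            if errors > max_errors:
                raise TooManyErrors(
                    f"{errors} errors exceed threshold {max_errors}"
                ) from exc
    return results


# One bad record is tolerated and handed to the handler; the rest are parsed.
bad_records = []
parsed = apply_with_error_handling(
    ["1", "2", "oops", "4"],
    int,
    on_error=lambda record, exc: bad_records.append(record),
    max_errors=1,
)
```

With max_errors left at 0 the first bad record stops the run, which corresponds to the current exit-on-error behavior. The handler only ever sees the failing record and the exception, in line with keeping error handling about data rather than control flow.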

*Brainstorming Ideas For 0.9:*

   - Internal development done by March
   - Release tentatively by May
   - Support for ILLUSTRATE.
   - Current status:
      - Parser rewrite almost complete
      - Working on loading data according to schema, with support for
      padding missing values
      - No support for Boolean type planned yet.
   - Big features in 0.9
      - Parser change
      - Macro support
      - Jython/Script support
      - Penny (formerly Inspector Gadget): framework to instrument scripts.
      Allows detecting bad records that cause failures and implementing
      constraints.
         - Works by integrating with the optimizer to produce wrappers for
         key UDFs of interest.
         - Agents can be added in different parts of the query
         - Prepackaged agents available, but framework allows the creation
         of custom agents as needed.
         - Pending work - implementation of unit tests, and turning this
         into a patch.
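
As a rough illustration of the "wrappers for key UDFs" idea, the Python sketch below wraps a function so that an agent is told about every input that makes it fail. It is only an assumption about the general shape of such instrumentation, not Penny's actual design or API; BadRecordAgent, observe_failure and instrument are hypothetical names.

```python
from typing import Any, Callable, List, Tuple


class BadRecordAgent:
    """Hypothetical agent that remembers every input that made a UDF fail."""

    def __init__(self) -> None:
        self.failures: List[Tuple[Any, str]] = []

    def observe_failure(self, record: Any, exc: Exception) -> None:
        self.failures.append((record, type(exc).__name__))


def instrument(
    udf: Callable[[Any], Any], agent: BadRecordAgent
) -> Callable[[Any], Any]:
    """Return a wrapper that reports failures to the agent, then re-raises."""

    def wrapper(record: Any) -> Any:
        try:
            return udf(record)
        except Exception as exc:
            agent.observe_failure(record, exc)
            raise  # the wrapper only observes; the failure still propagates

    return wrapper


agent = BadRecordAgent()
traced_ratio = instrument(lambda pair: pair[0] / pair[1], agent)
ok = traced_ratio((6, 3))
try:
    traced_ratio((1, 0))
except ZeroDivisionError:
    pass
```

The wrapper leaves the wrapped function's behavior unchanged, so the same script can run with or without instrumentation.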

*Brainstorming Ideas Beyond 0.9:*

   - Support for different backends for Pig (MR, Piranha, Local, Oozie)
      - Execution engine that can generate plans specific to the underlying
      architecture and allow controlling routines to rewrite/re-optimize the
      plan mid-execution.
   - Thread safety when running local jobs - to allow better embedding of
   Pig as a light-weight tool in web-applications and other multi-threaded
   environments.
      - Work includes making UDF context thread-safe and removing statics
      from the implementation.
      - Will benefit Oozie and other systems that embed Pig without having
      to worry about side-effects.
   - Allow execution to resume from where it left off after a runtime
   failure.
      - May be done by allowing Oozie as a backend where the plan is
      converted into an Oozie workflow.
      - Alternatively Pig could delegate blocks of execution to Oozie.
   - Scalability: Pig should support users who may not know the intricate
   details of the job/architecture. Things such as memory allocation and
   skew handling should happen automatically, without user involvement.
   - Allow Pig to kill jobs already submitted if the shell exits due to a
   Control+C or other failures.
   - UDF 2.0 - simplify UDF interfaces, along with support for multiple
   versions of the UDF at the same time.
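
To make the thread-safety item above a bit more concrete, here is a minimal
sketch of the general idea: instead of a UDF reaching into a static,
process-wide context, each script run builds its own context object and hands
it to the UDF after construction. All names here (ContextInjectionSketch,
ScriptContext, TagUdf, runScript) are hypothetical and chosen for
illustration only; this is not Pig's actual UDFContext API, just one way the
pattern could look.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ContextInjectionSketch {

    /** Hypothetical per-script context; replaces a static, process-wide one. */
    static final class ScriptContext {
        private final Map<String, String> properties = new ConcurrentHashMap<>();

        void set(String key, String value) { properties.put(key, value); }

        String get(String key) { return properties.get(key); }
    }

    /** Hypothetical UDF that is handed its context after construction. */
    static final class TagUdf {
        private ScriptContext context;

        void setContext(ScriptContext context) { this.context = context; }

        // Prefixes the input with the tag stored in this UDF's own context.
        String exec(String input) { return context.get("tag") + ":" + input; }
    }

    /** Runs one "script" with its own context and its own UDF instance. */
    static String runScript(String tag, String input) {
        ScriptContext context = new ScriptContext();
        context.set("tag", tag);
        TagUdf udf = new TagUdf();
        udf.setContext(context);
        return udf.exec(input);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Two scripts run concurrently; neither shares state with the other.
            Callable<String> taskA = () -> runScript("scriptA", "row");
            Callable<String> taskB = () -> runScript("scriptB", "row");
            Future<String> a = pool.submit(taskA);
            Future<String> b = pool.submit(taskB);
            System.out.println(a.get());
            System.out.println(b.get());
        } finally {
            pool.shutdown();
        }
    }
}
```

Because no state lives in static fields, the two runs cannot see each other's
settings, which is the property an embedding system would rely on.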


*General*

   - Loops in Pig: No direct support, but available indirectly by
   integration with scripting environments.
   - Would be good to allow Pig to be provisioned across the cluster for
   faster job startup.
   - Pig-pen: not under active development and not supported.



RE: REMINDER: Pig developer meeting in February

Posted by Santhosh Srinivasan <sm...@yahoo-inc.com>.
Arvind from Cloudera took excellent notes. You should see them next week after Alan gets a chance to review them.

Santhosh


Re: REMINDER: Pig developer meeting in February

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Hi folks,
Any chance someone took notes? :)

D


Re: REMINDER: Pig developer meeting in February

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Hi All,
I got sick and won't be able to make it. Would love to see some notes
after the meeting :).

D


RE: REMINDER: Pig developer meeting in February

Posted by Olga Natkovich <ol...@yahoo-inc.com>.
Hi Guys,

We are looking forward to seeing you tomorrow at 4 pm at the Yahoo campus in Sunnyvale.

Yahoo address is

701 First Ave.
Sunnyvale, CA 94089

We are in building E. Please, ask for Alan or me at the reception.

Olga

-----Original Message-----
From: Olga Natkovich [mailto:olgan@yahoo-inc.com] 
Sent: Thursday, February 03, 2011 10:42 AM
To: dev@pig.apache.org
Subject: REMINDER: Pig developer meeting in February

Hi guys,

This is just a reminder that the meeting will be held next Wednesday, 2/9 4-6 pm at Yahoo campus.

If you have not yet responded but are planning to attend, please let me know.

Olga

-----Original Message-----
From: Santhosh Srinivasan [mailto:sms@yahoo-inc.com] 
Sent: Friday, January 28, 2011 3:36 PM
To: dev@pig.apache.org
Subject: RE: Pig developer meeting in February

I am planning to attend. 

-----Original Message-----
From: Olga Natkovich [mailto:olgan@yahoo-inc.com] 
Sent: Friday, January 28, 2011 12:58 PM
To: dev@pig.apache.org
Subject: RE: Pig developer meeting in February

I believe we have critical mass so the meeting is on!

If you have not responded yet but are planning to attend, please let me know.

Thanks,

Olga

-----Original Message-----
From: Julien Le Dem [mailto:ledemj@yahoo-inc.com]
Sent: Thursday, January 27, 2011 5:21 PM
To: dev@pig.apache.org
Subject: Re: Pig developer meeting in February

Me too.
Julien


On 1/27/11 4:09 PM, "Dmitriy Ryaboy" <dv...@gmail.com> wrote:

Ok yeah I'll come :).



On Thu, Jan 27, 2011 at 3:17 PM, Olga Natkovich <ol...@yahoo-inc.com> wrote:

> While there is a lively discussion on this thread, I have not actually 
> gotten any responses to having the meeting with exception of 1 person :).
>
> Please, let me know by the end of the week if you are planning to attend.
> If we don't get at least a few more responses I suggest we postpone 
> the meeting.
>
> Thanks,
>
> Olga
>
> -----Original Message-----
> From: Dmitriy Ryaboy [mailto:dvryaboy@gmail.com]
> Sent: Wednesday, January 26, 2011 6:04 PM
> To: dev@pig.apache.org
> Subject: Re: Pig developer meeting in February
>
> Right, we do partition filtering, but not true predicate pushdown.
>
> On Wed, Jan 26, 2011 at 5:59 PM, Daniel Dai <ji...@yahoo-inc.com>
> wrote:
>
> > Are you talking about LoadMetadata.setPartitionFilter?
> > PartitionFilterOptimizer will do that.
> >
> > Daniel
> >
> >
> > Dmitriy Ryaboy wrote:
> >
> >> I may be wrong but I think predicate pushdown is designed for, but
> >> not actually implemented in the current LoadPushdown interface (you
> >> can only push projections). If I am wrong, that's great.. but if
> >> not, that would be an important feature to add, as people are trying
> >> to connect Pig to "smart" storage systems like rdbmses, HBase, and
> >> Cassandra more and more. I think we only kind of simulate this with
> >> partition keys info, which is not always sufficient
> >>
> >> D
> >>
> >> On Wed, Jan 26, 2011 at 2:41 PM, Julien Le Dem <le...@yahoo-inc.com> wrote:
> >>
> >>
> >>
> >>> If making Pig thread safe (i.e., two threads each running a different
> >>> Pig script) is important, then we need to change some of the APIs from
> >>> static singleton access to a dependency injection pattern.
> >>> In that case, this should probably be done before 1.0.
> >>> For example: UDFContext should be passed to the UDF after construction
> >>> (similar to the ServletContext in Servlet or the way Hadoop passes the
> >>> context to tasks).
> >>> Also, a clearly separated API that does not depend on the Pig
> >>> implementation would help.
> >>> For example, UDFContext is in org.apache.pig.impl.util when it
> >>> would be better in org.apache.pig.api (or at least an interface
> >>> defining it).
> >>>
> >>> Julien
> >>>
> >>> On 1/24/11 10:14 AM, "Olga Natkovich" <ol...@yahoo-inc.com> wrote:
> >>>
> >>> Hi Guys,
> >>>
> >>> I think it is time for us to have another meeting. Yahoo would be
> >>> happy to host if this works for everybody. How about Wednesday,
> >>> 2/9 4-6 pm.
> >>> Please, let us know if you are planning to attend and if the date/time
> >>> works for you.
> >>>
> >>> Things that come to mind to discuss and as always feel free to
> >>> suggest others:
> >>>
> >>> - Error handling proposal - this might be easier to finalize
> >>>   face-to-face
> >>> - Pig 0.9 plan
> >>> - Pig Roadmap beyond 0.9
> >>>    o What do we want to do in Pig.next?
> >>>    o Are we ready for Pig 1.0
> >>>
> >>> Olga
> >>>
> >>>
> >>>
> >>>
> >>
> >
>