Posted to user@pig.apache.org by Alan Gates <ga...@yahoo-inc.com> on 2008/10/10 18:24:06 UTC

Pig rework on the types branch

All,

As you have probably noticed if you've been watching the mailing
list, much work has gone into an almost complete rework of Pig over
the last six months.  This work has been done on the types branch in
order to avoid destabilizing the trunk.  It includes a complete
rewrite of the Pig backend, including the interface to MapReduce
and the operators that execute a Pig script on Hadoop.  It also
introduces a type system to Pig.  A number of new features have been
added and performance has been significantly improved (averaging
around 2x, though varying greatly by script).  And, while we strove to
be backward compatible whenever possible, there are places where
changes are required in user scripts or UDFs.  Full details of the
changes are available at http://wiki.apache.org/pig/TrunkToTypesChanges

After much testing by the developers and a number of brave users, we  
feel the code on the types branch is now approaching stability.  We  
would like to suggest that users begin using the code on the types  
branch.  At some point in the near future, we would like to merge the  
types branch into trunk and then do a 0.2.0 release.

Alan.

RE: Pig rework on the types branch

Posted by Olga Natkovich <ol...@yahoo-inc.com>.
Hi Ian,

We are still porting the UDFs to the types branch. I am hoping to be able
to commit that later next week. Once it is there, you should be able to
port your contributions to the types branch.

Olga 

> -----Original Message-----
> From: Ian Holsman [mailto:lists@holsman.net] 
> Sent: Thursday, October 16, 2008 8:39 PM
> To: pig-dev@incubator.apache.org
> Cc: pig-user@incubator.apache.org
> Subject: Re: Pig rework on the types branch
> 
> are you after contributions/samples for piggybank on this branch?
> 
> I recently created a non-regex based custom apache-logfile 
> loader for the types branch which might be helpful to others, 
> albeit it does currently not use the CLF format.
> 
> Alan Gates wrote:
> > All,
> >
> > As you have probably noticed if you've been watching the mailing list,
> > much work has gone into an almost complete rework of pig over the last
> > six months.  This work has been done on the types branch in order to
> > avoid destabilizing the trunk.  This work includes a complete rewrite
> > of the backend of pig, including the interface to map reduce and the
> > operators that execute a pig script on hadoop.  It also introduces a
> > type system to pig.  A number of new features have been added and
> > performance has been significantly improved (averaging around 2x
> > though varying greatly by script).  And, while we strove to be
> > backward compatible whenever possible, there are places where changes
> > are required in user scripts or UDFs.  Full details of the changes are
> > available at http://wiki.apache.org/pig/TrunkToTypesChanges
> >
> > After much testing by the developers and a number of brave users, we
> > feel the code on the types branch is now approaching stability.  We
> > would like to suggest that users begin using the code on the types
> > branch.  At some point in the near future, we would like to merge the
> > types branch into trunk and then do a 0.2.0 release.
> >
> > Alan.
> >
> 
> 

Re: Pig rework on the types branch

Posted by Ian Holsman <li...@holsman.net>.
Are you after contributions/samples for piggybank on this branch?

I recently created a non-regex-based custom Apache log file loader for
the types branch, which might be helpful to others, although it does
not currently use the CLF format.

Alan Gates wrote:
> All,
>
> As you have probably noticed if you've been watching the mailing list, 
> much work has gone into an almost complete rework of pig over the last 
> six months.  This work has been done on the types branch in order to 
> avoid destabilizing the trunk.  This work includes a complete rewrite 
> of the backend of pig, including the interface to map reduce and the 
> operators that execute a pig script on hadoop.  It also introduces a 
> type system to pig.  A number of new features have been added and 
> performance has been significantly improved (averaging around 2x 
> though varying greatly by script).  And, while we strove to be 
> backward compatible whenever possible, there are places where changes 
> are required in user scripts or UDFs.  Full details of the changes are 
> available at http://wiki.apache.org/pig/TrunkToTypesChanges
>
> After much testing by the developers and a number of brave users, we 
> feel the code on the types branch is now approaching stability.  We 
> would like to suggest that users begin using the code on the types 
> branch.  At some point in the near future, we would like to merge the 
> types branch into trunk and then do a 0.2.0 release.
>
> Alan.
>

