Posted to user@pig.apache.org by Dmitriy Ryaboy <dv...@gmail.com> on 2011/08/01 21:17:39 UTC

COUNT vs COUNT_STAR

The COUNT_STAR thing bites people a lot -- clearly, even the most advanced
Pig users mess this up once in a while. It's a really hard bug to track
down. We should reconsider our decision to make COUNT work the way it does.

D
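
For readers who have not hit this yet, here is a rough sketch of the difference being discussed. It is plain Python that only models the documented null handling of the two builtins (COUNT skips a tuple whose first field is null, COUNT_STAR counts every tuple). The helper names and the sample rows are made up for illustration and are not part of Pig.

```python
# Sketch only: models the null handling of Pig's COUNT and COUNT_STAR
# builtins in plain Python. This is not Pig code, and the helper names
# below are hypothetical.

def pig_count(bag):
    """Like Pig's COUNT: skip tuples whose first field is null (None)."""
    return sum(1 for t in bag if t[0] is not None)

def pig_count_star(bag):
    """Like Pig's COUNT_STAR: count every tuple, nulls included."""
    return len(bag)

# Hypothetical bag of (user, clicks) tuples; two rows have a null first field.
rows = [("alice", 3), (None, 7), ("bob", None), (None, None)]

print(pig_count(rows))       # 2
print(pig_count_star(rows))  # 4
```

A "row count" built on COUNT therefore silently undercounts whenever the first field can be null, which is why the bug is hard to spot.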

On Mon, Aug 1, 2011 at 10:54 AM, Daniel Dai <da...@hortonworks.com> wrote:

> On Sun, Jul 31, 2011 at 7:06 PM, Raghu Angadi <an...@gmail.com> wrote:
>
> > Great to see major user-facing features. Thanks, guys.
> >
> > Will we see some standard macros (e.g. rowcount()) similar to standard
> > UDFs?
> >
> > Even rowcount may not be trivial for a casual user to do correctly. Should
> > the rowcount() example in the blog use COUNT_STAR() rather than COUNT()?
> >
>
> Yes, thanks for pointing it out. I put a note on the blog.
>
>
> >
> > Raghu.
> >
> > On Fri, Jul 29, 2011 at 2:02 PM, Daniel Dai <da...@hortonworks.com> wrote:
> >
> > > We wrote a series of blog posts describing the new features of Pig 0.9.0 at
> > > http://www.hortonworks.com/blog/. The series contains three posts, which
> > > will be published in a few days.
> > >
> > > Thanks
> > > Daniel
> > >
> > > On Fri, Jul 29, 2011 at 1:25 PM, Olga Natkovich <ol...@yahoo-inc.com> wrote:
> > >
> > > > The Pig team is happy to announce the Pig 0.9.0 release.
> > > >
> > > > Apache Pig provides a high-level data-flow language and execution framework
> > > > for parallel computation on Hadoop clusters. More details about Pig can be
> > > > found at http://pig.apache.org/.
> > > >
> > > > The highlights of this release are the introduction of control structures,
> > > > a change of query parser, and semantic cleanup. The details of the release
> > > > can be found at http://pig.apache.org/releases.html.
> > > >
> > > > Olga
> > > >
> > > >
> > >
> >
>

Re: COUNT vs COUNT_STAR

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
I bet we'd make more Pig scripts correct than we would break.

D

On Sat, Aug 20, 2011 at 8:21 AM, Alan Gates <ga...@hortonworks.com> wrote:

> I don't see how we do this now without creating backward compatibility
> nightmares.
>
> Alan.
>
> On Aug 1, 2011, at 12:17 PM, Dmitriy Ryaboy wrote:
>
> > The COUNT_STAR thing bites people a lot -- clearly, even the most advanced
> > Pig users mess this up once in a while. It's a really hard bug to track
> > down. We should reconsider our decision to make COUNT work the way it does.
> >
> > D

Re: COUNT vs COUNT_STAR

Posted by Alan Gates <ga...@hortonworks.com>.
I don't see how we do this now without creating backward compatibility nightmares.

Alan.

On Aug 1, 2011, at 12:17 PM, Dmitriy Ryaboy wrote:

> The COUNT_STAR thing bites people a lot -- clearly, even the most advanced
> Pig users mess this up once in a while. It's a really hard bug to track
> down. We should reconsider our decision to make COUNT work the way it does.
> 
> D