Posted to user@pig.apache.org by Dave Wellman <da...@tynt.com> on 2010/08/19 00:29:58 UTC

Tuple compare

All,

I have what should be a simple problem.  I have 2 tuples that are chararrays, t1 and t2, and want to do a comparison.  Using

x = FILTER y BY (t1 == t2);

results in zero (0) records.

x = FILTER y BY (t1 != t2);

also results in zero records.  And

x = FILTER y BY (t1 matches t2);

results in an error.  Ideally there would be a StrComp(t1, t2) filter func.

Is there a UDF for that?

Cheers,

Re: Tuple compare

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Something strange is happening with your data. Can you provide an example?

I just tried this with both Pig 0.6 and Pig 0.8 (trunk):

grunt> strings = load 'tmp/strings.txt' as (a:chararray, b:chararray);
grunt> dump strings;
(foo,bar)
(foo,baz)
(foo,foo)
(bar,bar)
grunt> x = filter strings by a == b;
grunt> dump x;
(foo,foo)
(bar,bar)
grunt> x = filter strings by a != b;
grunt> dump x;
(foo,bar)
(foo,baz)
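
One possible explanation for both filters returning zero records, not confirmed anywhere in this thread, is that one of the fields is null in every row. In Pig a comparison with a null operand evaluates to null, and FILTER keeps a row only when its condition is true, so neither == nor != lets such a row through. The plain-Java sketch below only models that behaviour; the class name, the helper names, and the sample rows are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class NullFilterSketch {

    // Models Pig's null handling for ==: if either operand is null,
    // the result is null (unknown) rather than true or false.
    static Boolean eq(String a, String b) {
        if (a == null || b == null) {
            return null;
        }
        return a.equals(b);
    }

    // Models != the same way: unknown when either operand is null.
    static Boolean neq(String a, String b) {
        Boolean e = eq(a, b);
        return e == null ? null : !e;
    }

    // Models FILTER: a row is kept only when the condition is exactly true.
    static List<String[]> filter(List<String[]> rows, boolean wantEqual) {
        List<String[]> kept = new ArrayList<>();
        for (String[] row : rows) {
            Boolean cond = wantEqual ? eq(row[0], row[1]) : neq(row[0], row[1]);
            if (Boolean.TRUE.equals(cond)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical rows in which the second field failed to load.
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"foo", null});
        rows.add(new String[] {"bar", null});

        System.out.println(filter(rows, true).size());  // prints 0
        System.out.println(filter(rows, false).size()); // prints 0
    }
}
```

If this is what is happening, dumping the relation before the filter should show empty fields where values were expected.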




Re: Tuple compare

Posted by Dave Wellman <da...@tynt.com>.
Because I wasn't able to find one, I tossed this UDF into the mix.

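import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class StrComp extends EvalFunc<Integer> {

	@Override
	public Integer exec(Tuple input) throws IOException {
		// The input tuple should hold exactly 2 fields.
		if (input == null || input.size() != 2) {
			throw new IOException("Dude where's my tuples?");
		}
		Object left = input.get(0);
		Object right = input.get(1);
		// A null field cannot be compared; return null instead of throwing.
		if (left == null || right == null) {
			return null;
		}
		// Negative, zero, or positive, as with String.compareTo.
		return left.toString().compareTo(right.toString());
	}
}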

And the pig calls:

x = FILTER y BY StrComp(a, b) == 0;

or 

x = FILTER y BY StrComp(a, b) != 0;

The tuples a and b are chararrays.  My solution "works", but a nice standard piggybank UDF would be better.
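
For reference, here is a minimal plain-Java sketch of the comparison the UDF relies on, without the Pig Tuple wrapper. The class and method names are hypothetical, and the inputs are borrowed from the sample rows posted earlier in the thread. It shows that String.compareTo returns the difference between the first mismatching characters, not just -1, 0, or 1.

```java
public class StrCompSketch {

    // Same comparison the UDF performs on its two fields.
    static int strComp(Object a, Object b) {
        return a.toString().compareTo(b.toString());
    }

    public static void main(String[] args) {
        System.out.println(strComp("foo", "foo")); // prints 0
        System.out.println(strComp("foo", "bar")); // prints 4  ('f' - 'b')
        System.out.println(strComp("bar", "baz")); // prints -8 ('r' - 'z')
    }
}
```

So the filter should test the result against 0 (== 0, != 0, < 0, > 0) and not against 1 or -1.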




Re: Tuple compare

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
Dave,
Can you provide some sample data? A tuple can't be a chararray (but it can
contain one), so I want to make sure I understand what the data you are
working with looks like.

-Dmitriy
