Posted to user@pig.apache.org by Ramana Venkata <ra...@ohana-media.com> on 2010/01/28 08:38:09 UTC

How to write an UDF to pass Two parameters to a UDF Filter function.........

Hi,
I want to create a UDF that compares a tuple with a string value, like this:

public class IsEqual extends FilterFunc {
    public Boolean exec(Tuple input, String str) throws IOException {
        // a bitwise AND of the two values is performed here;
        // if the result of the AND is not zero, it returns true
        return true;
    }
}


Is this possible with a Pig UDF?

Actually, I want to compare two binary values with an AND operation, as follows.

The table data is:

ramana      1010101
krishna      1000010
venkata     1101010
......
load 'data' as name,category using PigStorage("\t");
cameusers = filter data  by  IsEqual(category,"1000010");
store cameusers;
-----------------------------------
The result I expect is:
krishna      1000010


Is there any other solution for this operation without a UDF? Can we
compare the category column with binary data?


Please respond.


Thanks,
ramanaiah

Re: How to write an UDF to pass Two parameters to a UDF Filter function.........

Posted by Jeff Zhang <zj...@gmail.com>.
Ramana,

Actually, there's no binary type in Pig. If you do not specify a type in
the load statement, the default type is bytearray. I'm afraid you have to
write a UDF to do the binary comparison. In the UDF, you should first
convert the byte array to a binary value and then compare the two values.
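
As a rough sketch of that conversion, assuming the category column holds
only the characters '0' and '1' (as in the sample data) and is short enough
to fit in a long: the helper below is plain Java with no Pig classes, and
the names BinaryAnd, parseBits and andIsNonZero are made up for
illustration. In a real UDF the raw bytes would come from the untyped field.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Pig: treats a field as a string of
// '0'/'1' characters and works on its numeric value.
final class BinaryAnd {
    private BinaryAnd() {}

    // Decode raw field bytes (e.g. the contents of an untyped column)
    // into a long, reading the text as base 2.
    static long parseBits(byte[] raw) {
        String text = new String(raw, StandardCharsets.UTF_8).trim();
        return Long.parseLong(text, 2); // throws NumberFormatException on bad input
    }

    // True when the two values share at least one set bit.
    static boolean andIsNonZero(byte[] left, byte[] right) {
        return (parseBits(left) & parseBits(right)) != 0;
    }
}
```

Parsing to a long keeps the AND a single machine operation; longer bit
strings would need something like java.math.BigInteger instead.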







-- 
Best Regards

Jeff Zhang

Re: How to write an UDF to pass Two parameters to a UDF Filter function.........

Posted by Mridul Muralidharan <mr...@yahoo-inc.com>.

There was an error in the original script, which I propagated when I
copied and pasted it. A corrected version is below.


Regards,
Mridul

Mridul Muralidharan wrote:
> There are two ways to handle this.
> You can pass it along as a parameter as you did in the script - though 
> note that, in your udf, it will be a tuple with first field == category, 
> second field == "1000010".
> 
> public Boolean exec(Tuple _input) throws IOException {
>    String input = (String)_input.get(0);
>    String compareStr = (String)_input.get(1);
>    ...
> }
> 
> But that might be a tad bit more expensive : since each tuple which gets 
> passed through the FilterFunc will need to have the static "1000010" 
> added to it.
> 
> 
> A better alternative is to use "define" - and initialize your IsEqual 
> class with the static param you need : by passing it through constructor.
> 
> 
> Something like :
> 
> 
> public class IsEqual extends FilterFunc {
>    private String compareStr;
> 
>    public IsEqual(String compareStr){
>      this.compareStr = compareStr;
>    }
> 

I assumed that the first parameter is going to be a String, which need not
be the case (since its type is not defined in the load schema). In that
case, replace "String input" with the appropriate datatype.

>    public Boolean exec(Tuple _input) throws IOException {
>      String input = (String)_input.get(0);
>      ...
>    }
> }
> 
> 
> You use it by :
> 
> 
> define MY_EQUAL IsEqual("1000010");
> 
> load 'data' as name,category using PigStorage("\t");
> cameusers = filter data  by  MY_EQUAL(category);
> store cameusers;
> 


define MY_EQUAL IsEqual('1000010');

data = load 'data' using PigStorage('\t') as (name,category);
cameusers = filter data  by  MY_EQUAL(category);
store cameusers;



If you need name and category to be strings (which I suspect you
do), then use "data = load 'data' using PigStorage('\t') as
(name:chararray,category:chararray);"


Regards,
Mridul



Re: How to write an UDF to pass Two parameters to a UDF Filter function.........

Posted by Mridul Muralidharan <mr...@yahoo-inc.com>.
There are two ways to handle this.
You can pass it along as a parameter, as you did in the script. Note,
though, that your UDF will receive a single tuple whose first field is
category and whose second field is "1000010".

public Boolean exec(Tuple _input) throws IOException {
   String input = (String)_input.get(0);
   String compareStr = (String)_input.get(1);
   ...
}

But that might be a bit more expensive, since every tuple that gets
passed through the FilterFunc needs to have the constant "1000010"
added to it.
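
To make the two-field form concrete, here is a minimal sketch that runs
without Pig on the classpath. It uses a java.util.List as a stand-in for
Pig's Tuple (an assumption made only so the snippet is self-contained), and
the class name TwoFieldFilterSketch is hypothetical. Both fields are parsed
on every call, which is the per-row cost mentioned above.

```java
import java.util.List;

// Stand-in sketch: a List plays the role of the input tuple, with
// field 0 = category and field 1 = the constant passed in the script.
final class TwoFieldFilterSketch {
    private TwoFieldFilterSketch() {}

    static Boolean exec(List<?> input) {
        // Reject rows that do not carry both fields.
        if (input == null || input.size() < 2) {
            return false;
        }
        Object category = input.get(0);
        Object mask = input.get(1);
        if (category == null || mask == null) {
            return false;
        }
        // Both values are re-parsed for each row.
        long categoryBits = Long.parseLong(category.toString().trim(), 2);
        long maskBits = Long.parseLong(mask.toString().trim(), 2);
        return (categoryBits & maskBits) != 0;
    }
}
```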


A better alternative is to use "define" and initialize your IsEqual
class with the constant parameter you need by passing it through the
constructor.


Something like:


public class IsEqual extends FilterFunc {
   private String compareStr;

   public IsEqual(String compareStr){
     this.compareStr = compareStr;
   }

   public Boolean exec(Tuple _input) throws IOException {
     String input = (String)_input.get(0);
     ...
   }
}


You use it like this:


define MY_EQUAL IsEqual("1000010");

load 'data' as name,category using PigStorage("\t");
cameusers = filter data  by  MY_EQUAL(category);
store cameusers;
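
The constructor pattern can be sketched in plain Java as follows, again
without Pig classes so that it runs on its own. MaskFilterSketch and its
method names are hypothetical, and the sketch assumes the category values
are strings of '0' and '1' that fit in a long. The constant is parsed once,
when the object is built, rather than once per row.

```java
// Stand-in for a filter class configured once through its constructor.
final class MaskFilterSketch {
    private final long mask;

    // Parse the constant a single time, at construction.
    MaskFilterSketch(String maskBits) {
        this.mask = parse(maskBits);
    }

    private static long parse(String bits) {
        return Long.parseLong(bits.trim(), 2);
    }

    // True when the category shares at least one set bit with the mask.
    boolean sharesBit(String categoryBits) {
        return categoryBits != null && (parse(categoryBits) & mask) != 0;
    }

    // True only when the category has exactly the same bits as the mask.
    boolean equalsMask(String categoryBits) {
        return categoryBits != null && parse(categoryBits) == mask;
    }
}
```

With the three sample rows and the constant 1000010, sharesBit is true for
every row, because each value has the leading bit set. equalsMask is true
only for the krishna row, which is the single-row result the original
question expects.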





Hope this helps.
Regards,
Mridul

