Posted to user@pig.apache.org by Ramana Venkata <ra...@ohana-media.com> on 2010/01/28 08:38:09 UTC
How to write an UDF to pass Two parameters to a UDF Filter function.........
Hi,
I want to create a UDF that compares a tuple with a string value, like this:
public class IsEqual extends FilterFunc {
    public Boolean exec(Tuple input,String str) throws IOException {
        // a bitwise AND of the two values is performed here;
        // if the result of the AND is not zero, it returns true
        return true;
    }
}
Is this possible with a Pig UDF?
Actually, I want to compare two binary values with an AND operation, as follows.
Table data is
ramana 1010101
krishna 1000010
venkata 1101010
......
load 'data' as name,category using PigStorate("\t");
cameusers = filter data by IsEqual(category,"1000010");
store cameusers;
-----------------------------------
The result I expect is:
krishna 1000010
Is there any other solution for this operation without a UDF? Can we
compare the category column with binary data?
Please respond.
Thanks,
ramanaiah
Re: How to write an UDF to pass Two parameters to a UDF Filter function.........
Posted by Jeff Zhang <zj...@gmail.com>.
Ramana,
Actually, there is no binary type in Pig. If you do not specify a type in
the load statement, the default type is bytearray. I'm afraid you have to
write a UDF to do the binary comparison. In the UDF, first convert each
byte array to a binary value, then compare the two values.
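The conversion described above might look roughly like the following. This is only a sketch in plain Java so that it runs without Pig on the classpath: BinaryFieldCompare, parseBits, sharesAnyBit and sameBits are hypothetical names, and it assumes the field arrives as the raw bytes of text such as "1000010" (inside a real UDF those bytes would come from the bytearray field, for example via DataByteArray.get()).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Pig: logic a FilterFunc could call from exec().
final class BinaryFieldCompare {

    // Decode raw field bytes such as "1000010" and parse them as a base-2 number.
    static long parseBits(byte[] raw) {
        String text = new String(raw, StandardCharsets.UTF_8).trim();
        return Long.parseLong(text, 2); // NumberFormatException if the text is not binary
    }

    // True when the field and the mask have at least one set bit in common.
    static boolean sharesAnyBit(byte[] raw, long mask) {
        return (parseBits(raw) & mask) != 0;
    }

    // True when the field has exactly the same bits as the mask.
    static boolean sameBits(byte[] raw, long mask) {
        return parseBits(raw) == mask;
    }

    public static void main(String[] args) {
        long mask = Long.parseLong("1000010", 2);
        for (String category : new String[] {"1010101", "1000010", "1101010"}) {
            byte[] raw = category.getBytes(StandardCharsets.UTF_8);
            System.out.println(category
                    + " and=" + sharesAnyBit(raw, mask)
                    + " equal=" + sameBits(raw, mask));
        }
    }
}
```

Note that with the three sample rows from the question, the "AND is not zero" test is true for every row, because each value shares the leading bit with 1000010. Only the exact comparison singles out krishna, so it is worth deciding which of the two checks is really wanted.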
--
Best Regards
Jeff Zhang
Re: How to write an UDF to pass Two parameters to a UDF Filter function.........
Posted by Mridul Muralidharan <mr...@yahoo-inc.com>.
There is an error in the basic script - which I propagated in my copy
paste - corrected below.
Regards,
Mridul
Mridul Muralidharan wrote:
> There are two ways to handle this.
> You can pass it along as a parameter as you did in the script - though
> note that, in your udf, it will be a tuple with first field == category,
> second field == "1000010".
>
> public Boolean exec(Tuple _input) throws IOException {
> String input = (String)_input.get(0);
> String compareStr = (String)_input.get(1);
> ...
> }
>
> But that might be a tad bit more expensive : since each tuple which gets
> passed through the FilterFunc will need to have the static "1000010"
> added to it.
>
>
> A better alternative is to use "define" - and initialize your IsEqual
> class with the static param you need : by passing it through constructor.
>
>
> Something like :
>
>
> public class IsEqual extends FilterFunc {
> private String compareStr;
>
> public IsEqual(String compareStr){
> this.compareStr = compareStr;
> }
>
I assumed that the first param is going to be a String - which need not be
the case (since it is not defined in the load schema), in which case "String
input" gets replaced with the appropriate datatype.
> public Boolean exec(Tuple _input) throws IOException {
> String input = (String)_input.get(0);
> ...
> }
> }
>
>
> You use it by :
>
>
> define MY_EQUAL IsEqual("1000010");
>
> load 'data' as name,category using PigStorate("\t");
> cameusers = filter data by MY_EQUAL(category);
> store cameusers;
>
define MY_EQUAL IsEqual('1000010');
data = load 'data' using PigStorage('\t') as (name,category);
cameusers = filter data by MY_EQUAL(category);
store cameusers;
If you need the name and category to be strings (which I suspect you
do), then use "data = load 'data' using PigStorage('\t') as
(name:chararray,category:chararray);"
Regards,
Mridul
>
>
>
>
> Hope this helps.
> Regards,
> Mridul
Re: How to write an UDF to pass Two parameters to a UDF Filter function.........
Posted by Mridul Muralidharan <mr...@yahoo-inc.com>.
There are two ways to handle this.
You can pass it along as a parameter as you did in the script - though
note that, in your udf, it will be a tuple with first field == category,
second field == "1000010".
public Boolean exec(Tuple _input) throws IOException {
String input = (String)_input.get(0);
String compareStr = (String)_input.get(1);
...
}
But that might be a tad bit more expensive : since each tuple which gets
passed through the FilterFunc will need to have the static "1000010"
added to it.
A better alternative is to use "define" - and initialize your IsEqual
class with the static param you need : by passing it through constructor.
Something like :
public class IsEqual extends FilterFunc {
private String compareStr;
public IsEqual(String compareStr){
this.compareStr = compareStr;
}
public Boolean exec(Tuple _input) throws IOException {
String input = (String)_input.get(0);
...
}
}
You use it by :
define MY_EQUAL IsEqual("1000010");
load 'data' as name,category using PigStorate("\t");
cameusers = filter data by MY_EQUAL(category);
store cameusers;
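The constructor-based design can be tried out without Pig with a small stand-in like the one below. It is only a sketch: ConfiguredFilterSketch, IsEqualSketch and keepMatching are hypothetical names, java.util.function.Predicate stands in for FilterFunc, and an exact string match is assumed as the comparison. The point it illustrates is that the constant is handed over once, when the filter object is built, and each row then supplies only its own category value.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical stand-in for the define/constructor pattern; no Pig classes are used.
final class ConfiguredFilterSketch {

    // Plays the role of IsEqual: the constant is fixed at construction time.
    static final class IsEqualSketch implements Predicate<String> {
        private final String compareStr;

        IsEqualSketch(String compareStr) {
            this.compareStr = compareStr;
        }

        @Override
        public boolean test(String category) {
            return category != null && category.equals(compareStr);
        }
    }

    // Returns the names of the rows (name, category) whose category passes the filter.
    static List<String> keepMatching(List<String[]> rows, Predicate<String> filter) {
        return rows.stream()
                .filter(row -> filter.test(row[1]))
                .map(row -> row[0])
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"ramana", "1010101"},
                new String[] {"krishna", "1000010"},
                new String[] {"venkata", "1101010"});
        Predicate<String> myEqual = new IsEqualSketch("1000010"); // like: define MY_EQUAL ...
        System.out.println(keepMatching(rows, myEqual)); // prints [krishna]
    }
}
```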
Hope this helps.
Regards,
Mridul