Posted to user@pig.apache.org by Васяйчев Сергей <va...@mail.ru> on 2010/08/06 14:29:58 UTC
UDF. Change outputSchema from Evaluate
Hello!
I created a custom UDF and need its output schema to have different columns
(2 chararray fields in one case, 3 chararray fields in the other),
depending on a UDF input parameter (mode_1 = 'one'):
..FLATTEN(myUDF('int, char', 'one'))
I overrode
@Override
public Schema getOutputSchema(Schema input) {
    Schema schema = new Schema();
    // Check the mode_1 parameter for 'one'. It is null here because it is
    // only set in evaluate().
    if (mode_1 != null && mode_1.equals("one")) {
        schema.add(new FieldSchema("type", DataType.CHARARRAY));
        schema.add(new FieldSchema("text", DataType.CHARARRAY));
        return schema;
    }
    // default
    schema.add(new FieldSchema("type", DataType.CHARARRAY));
    schema.add(new FieldSchema("text", DataType.CHARARRAY));
    schema.add(new FieldSchema("age", DataType.CHARARRAY));
    return schema;
}
but this function is called before
@Override
public DataBag evaluate(Tuple input)
{
    // get the UDF mode parameter mode_1
    Object xmode = input.get(1);
    mode_1 = xmode.toString();
    ...
}
which is the only place where I can check the needed parameter 'one', so I
can't change the schema that getOutputSchema(...) has already defined.
Is there a way to change the output schema from evaluate(...)?
I tried to create a constructor
public myUDF(String param, String mode )
{
}
but it was never called. Only myUDF(), the constructor without parameters,
was called.
A static class variable for the mode doesn't work either.
I use Pig 0.3.0.
--
Best regards,
Serg.
Re: UDF. Change outputSchema from Evaluate
Posted by Mridul Muralidharan <mr...@yahoo-inc.com>.
If I understood your problem correctly, you can use define to pass
parameters to the constructor and then use them (after storing them in
instance fields).
Note: only Strings are accepted as parameters!
define MY_UDF org.me.udfp.MyUDF('param1', 'param2');
-- This will call the constructor public MyUDF(String str1, String str2){...}
....
B = FOREACH A GENERATE MY_UDF($0);
....
- Mridul
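To illustrate the suggestion above, here is a minimal sketch of keeping the mode in an instance field that the constructor sets, so that it is already known when the schema is requested. It is a stand-in only: ModeAwareUdf and outputFieldNames are hypothetical names, the class deliberately does not extend Pig's EvalFunc (so it compiles without the Pig jar), and the schema is modelled as a plain list of the column names from the question.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/**
 * Stand-in for a UDF whose output columns depend on a constructor argument.
 * Hypothetical class: it does not use any Pig classes; the column names are
 * the ones from the question.
 */
class ModeAwareUdf {
    // Set once in the constructor, so it is known before any schema query.
    private final String mode;

    // No-argument constructor: falls back to the default (three-column) mode.
    ModeAwareUdf() {
        this(null);
    }

    // Constructor that a definition such as define MY_UDF ...('one') is meant to reach.
    ModeAwareUdf(String mode) {
        this.mode = mode;
    }

    // Plays the role of the schema method: returns the output column names.
    List<String> outputFieldNames() {
        if ("one".equals(mode)) {
            return Collections.unmodifiableList(Arrays.asList("type", "text"));
        }
        // default: three columns
        return Collections.unmodifiableList(Arrays.asList("type", "text", "age"));
    }

    public static void main(String[] args) {
        System.out.println(new ModeAwareUdf("one").outputFieldNames());
        System.out.println(new ModeAwareUdf().outputFieldNames());
    }
}
```

In an actual UDF the same field would presumably be read inside the schema method to build the two- or three-column Schema, instead of waiting for the value to arrive in evaluate(...).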