Posted to user@pig.apache.org by Васяйчев Сергей <va...@mail.ru> on 2010/08/06 14:29:58 UTC

UDF. Change outputSchema from Evaluate

Hello!

I created a custom UDF and need its output schema to have different columns
(2 chararray fields in one case, 3 chararray fields in the other)
depending on a UDF input parameter (mode_1 = 'one'):
..FLATTEN(myUDF('int, char', 'one'))

I overrode
@Override
public Schema getOutputSchema(Schema input) {
    Schema schema = new Schema();
    // Check the mode_1 parameter for 'one'. It is null here because it
    // is only set in evaluate().
    if (mode_1 != null && mode_1.equals("one")) {
        schema.add(new FieldSchema("type", DataType.CHARARRAY));
        schema.add(new FieldSchema("text", DataType.CHARARRAY));
        return schema;
    }
    // default
    schema.add(new FieldSchema("type", DataType.CHARARRAY));
    schema.add(new FieldSchema("text", DataType.CHARARRAY));
    schema.add(new FieldSchema("age", DataType.CHARARRAY));
    return schema;
}
but this function is called before
@Override
public DataBag evaluate(Tuple input)
{
    // get UDF mode parameter mode_1
    Object xmode = input.get(1);
    mode_1 = xmode.toString();
    ...
}
which is the only place where I can check the needed parameter 'one', so I
can't change the schema that getOutputSchema(...) has already defined.

Is there a way to change the output schema from evaluate(...)?

I tried to create a constructor
 public myUDF(String param, String mode)
  {
  }
but it was never called. Only myUDF(), the constructor without parameters,
was called.

A static class variable for the mode doesn't work either.
I use Pig 0.3.0.

-- 
Best regards,
  Serg.



Re: UDF. Change outputSchema from Evaluate

Posted by Mridul Muralidharan <mr...@yahoo-inc.com>.
If I understood your problem right, you can use define to pass
parameters to the constructor and then use them (after storing them in an
instance field).


-- note: only Strings are accepted as parameters!
define MY_UDF org.me.udfp.MyUDF('param1', 'param2');
-- This will call the constructor public MyUDF(String str1, String str2){...}


....
B = FOREACH A GENERATE MY_UDF($0);
....
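
The idea can be sketched roughly as follows. This is only an illustration of the pattern, not a real Pig UDF: it deliberately uses no Pig classes so that it runs standalone, and the class name ModeSchemaSketch, the method outputFieldNames() and the field names are all hypothetical. In an actual UDF the class would extend EvalFunc and build a Schema instead of a list of names; the point is only that the mode is stored by the constructor, so it is already known when the schema is requested and does not depend on evaluate() having run.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone sketch (assumption: no Pig classes, all names are hypothetical).
class ModeSchemaSketch {
    // Stored by the constructor, so it is set before any tuple is evaluated.
    private final String param;
    private final String mode;

    // Stand-in for the no-argument constructor used when the UDF is
    // referenced without a define statement.
    ModeSchemaSketch() {
        this("", "");
    }

    // Stand-in for the constructor that a define with two string
    // arguments would invoke.
    ModeSchemaSketch(String param, String mode) {
        this.param = param;
        this.mode = mode;
    }

    // Plays the role of the schema method: the column list depends only on
    // the instance field, not on anything done in evaluate().
    List<String> outputFieldNames() {
        List<String> fields = new ArrayList<>();
        fields.add("type");
        fields.add("text");
        if (!"one".equals(mode)) {
            // default case: three columns
            fields.add("age");
        }
        return fields;
    }

    public static void main(String[] args) {
        // prints [type, text]
        System.out.println(new ModeSchemaSketch("int, char", "one").outputFieldNames());
        // prints [type, text, age]
        System.out.println(new ModeSchemaSketch().outputFieldNames());
    }
}
```

With this layout the mode no longer needs to be passed as a per-row argument; it would be given once in the define statement and the remaining arguments passed in the FOREACH.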



- Mridul


