Posted to user@pig.apache.org by Xiaomeng Wan <sh...@gmail.com> on 2011/04/01 00:07:02 UTC

CDH3 fail python udf

Hi,
We recently upgraded our Hadoop from CDH2 to CDH3b4 and have had problems
using some old Python UDFs. Running in local mode still works, but in
Hadoop mode it gives errors like "could not instantiate
'org.apache.pig.scripting.jython.JythonFunction' with arguments...".
Has anyone seen a similar error with Python UDFs on this Hadoop distribution?
We are using Pig 0.8.0. Thanks!

Regards
Shawn

Re: CDH3 fail python udf

Posted by Xiaomeng Wan <sh...@gmail.com>.
Thanks, Aniket!
I will look into it.

Regards,
Shawn

On Fri, Apr 1, 2011 at 5:10 PM, Aniket Mokashi <am...@andrew.cmu.edu> wrote:
> Hi Shawn,
>
> I think this is more of a CDH packaging problem than a Pig problem. I suspect
> it is related to the Java versions of Jython and other components.
>
> You may look into
> https://docs.cloudera.com/download/attachments/8784980/CDH3b3_Installation_Guide.pdf?version=1&modificationDate=1300229469101
> for more details.
>
> Thanks,
> Aniket
>

Re: CDH3 fail python udf

Posted by Aniket Mokashi <am...@andrew.cmu.edu>.
Hi Shawn,

I think this is more of a CDH packaging problem than a Pig problem. I suspect
it is related to the Java versions of Jython and other components.

You may look into
https://docs.cloudera.com/download/attachments/8784980/CDH3b3_Installation_Guide.pdf?version=1&modificationDate=1300229469101
for more details.

Thanks,
Aniket

On Fri, April 1, 2011 4:42 pm, Xiaomeng Wan wrote:
> Hi Aniket,
>
>
> Here is the stacktrace of the exception.
>
>
> java.io.IOException: Deserialization error: could not instantiate
> 'org.apache.pig.scripting.jython.JythonFunction' with arguments
> '[/home/shawn/TESS/code/mypyudfs.py, isStopWord]'
> 	at org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:55)
> 	[...]
> Caused by: java.lang.RuntimeException: could not instantiate
> 'org.apache.pig.scripting.jython.JythonFunction' with arguments
> '[/home/shawn/TESS/code/mypyudfs.py, isStopWord]'
> 	at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:502)
> 	[...]
> Caused by: java.lang.reflect.InvocationTargetException
> 	[...]
> 	at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:470)
> 	... 87 more
> Caused by: java.lang.IllegalStateException: Could not initialize:
> /home/shawn/TESS/code/mypyudfs.py
> 	at org.apache.pig.scripting.jython.JythonFunction.<init>(JythonFunction.java:86)
> 	... 92 more
> 2011-04-01 14:31:40,977 INFO org.apache.hadoop.mapred.Task: Runnning
> cleanup for the task
>
> Thanks!
>
>
> Regards,
> Shawn
>
>

Re: CDH3 fail python udf

Posted by Xiaomeng Wan <sh...@gmail.com>.
Hi Aniket,

Here is the stacktrace of the exception.

java.io.IOException: Deserialization error: could not instantiate
'org.apache.pig.scripting.jython.JythonFunction' with arguments
'[/home/shawn/TESS/code/mypyudfs.py, isStopWord]'
	at org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:55)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.setup(PigMapBase.java:151)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:646)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:251)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
	at org.apache.hadoop.mapred.Child.main(Child.java:245)
Caused by: java.lang.RuntimeException: could not instantiate
'org.apache.pig.scripting.jython.JythonFunction' with arguments
'[/home/shawn/TESS/code/mypyudfs.py, isStopWord]'
	at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:502)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.instantiateFunc(POUserFunc.java:109)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.readObject(POUserFunc.java:451)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:974)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1849)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
	at java.util.ArrayList.readObject(ArrayList.java:593)
	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:974)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1849)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
	at java.util.HashMap.readObject(HashMap.java:1030)
	at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:974)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1849)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
	at java.util.ArrayList.readObject(ArrayList.java:593)
	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:974)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1849)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
	at java.util.ArrayList.readObject(ArrayList.java:593)
	at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:974)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1849)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
	at java.util.HashMap.readObject(HashMap.java:1030)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:974)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1849)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
	at org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:53)
	... 9 more
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
	at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:470)
	... 87 more
Caused by: java.lang.IllegalStateException: Could not initialize:
/home/shawn/TESS/code/mypyudfs.py
	at org.apache.pig.scripting.jython.JythonFunction.<init>(JythonFunction.java:86)
	... 92 more
2011-04-01 14:31:40,977 INFO org.apache.hadoop.mapred.Task: Runnning
cleanup for the task

Thanks!

Regards,
Shawn
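
The trace in this message nests several "Caused by:" sections, and the
innermost one is usually the most informative. A minimal sketch of pulling
it out of a trace in the one-frame-per-line form follows. The function name
root_cause and the abbreviated SAMPLE excerpt are illustrative only and are
not part of Pig or Hadoop.

```python
# Sketch only: pull the innermost "Caused by:" entry out of a Java stack trace.

def root_cause(trace_text):
    """Return the text of the last 'Caused by:' entry, or None if there is none."""
    causes = []
    collecting = False  # True while reading a (possibly wrapped) cause message
    for raw in trace_text.splitlines():
        line = raw.strip()
        if line.startswith("Caused by:"):
            causes.append(line[len("Caused by:"):].strip())
            collecting = True
        elif line.startswith("at ") or line.startswith("..."):
            collecting = False  # stack frames end the message
        elif collecting and line:
            causes[-1] += " " + line  # rejoin a message wrapped across lines
    return causes[-1] if causes else None


# Abbreviated excerpt of the trace posted in this thread.
SAMPLE = """\
java.io.IOException: Deserialization error: could not instantiate
'org.apache.pig.scripting.jython.JythonFunction' with arguments
'[/home/shawn/TESS/code/mypyudfs.py, isStopWord]'
    at org.apache.pig.impl.util.ObjectSerializer.deserialize(ObjectSerializer.java:55)
Caused by: java.lang.reflect.InvocationTargetException
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    ... 87 more
Caused by: java.lang.IllegalStateException: Could not initialize:
/home/shawn/TESS/code/mypyudfs.py
    at org.apache.pig.scripting.jython.JythonFunction.<init>(JythonFunction.java:86)
    ... 92 more
"""

if __name__ == "__main__":
    print(root_cause(SAMPLE))
```

For this trace the innermost cause is the IllegalStateException raised in
the JythonFunction constructor while initializing the script; the thread
never confirms what made that initialization fail.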

On Fri, Apr 1, 2011 at 2:24 PM, Aniket Mokashi <am...@andrew.cmu.edu> wrote:
> Hi Shawn,
>
> Every time we throw an exception with the 'could not instantiate ..' error
> message, we also pass down the real exception instance, which might point
> to the reason we fail in this scenario.
> Can you provide the details of the exception from the log?
>
> The way this works is: when you register the myudf.py script, we register
> all the function names inside the script with Pig, and when these functions
> are used, we parse and construct them with the JythonFunction constructor.
>
> Thanks,
> Aniket
>

Re: CDH3 fail python udf

Posted by Aniket Mokashi <am...@andrew.cmu.edu>.
Hi Shawn,

Every time we throw an exception with the 'could not instantiate ..' error
message, we also pass down the real exception instance, which might point
to the reason we fail in this scenario.
Can you provide the details of the exception from the log?

The way this works is: when you register the myudf.py script, we register
all the function names inside the script with Pig, and when these functions
are used, we parse and construct them with the JythonFunction constructor.

Thanks,
Aniket
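
As a rough illustration of "register all the function names inside the
script", the sketch here lists the module-level functions of a UDF file.
It uses Python's ast module purely for illustration; Pig itself goes
through its Jython script engine, not ast. The contents of UDF_SOURCE are a
hypothetical stand-in, since the real mypyudfs.py is not shown in the
thread; only the name isStopWord comes from the stack trace. The
outputSchema decorator and its schema string are likewise assumptions here.

```python
import ast

# Hypothetical stand-in for the UDF script; the real file is not in the thread.
# In a Pig script it would be registered with something like:
#   register 'mypyudfs.py' using jython as myfuncs;
# and then called as myfuncs.isStopWord(word).
UDF_SOURCE = '''
STOP_WORDS = set(["a", "an", "the", "of"])

@outputSchema("is_stop:int")
def isStopWord(word):
    return 1 if word is not None and word.lower() in STOP_WORDS else 0

def normalize(word):
    return word.strip().lower()
'''


def top_level_functions(source):
    """Return the names of functions defined at module level, in source order."""
    tree = ast.parse(source)  # parses only; the script is never executed
    return [node.name for node in tree.body if isinstance(node, ast.FunctionDef)]


if __name__ == "__main__":
    print(top_level_functions(UDF_SOURCE))
```

Each name found this way corresponds to one function that could be referred
to from the Pig script after registration. The arguments in the error
message, a script path plus a function name, match that pairing.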

On Fri, April 1, 2011 12:06 pm, Xiaomeng Wan wrote:
> Hi Aniket,
>
>
> We put both jython.jar and myudf.py on the classpath and also register
> jython.jar in our Pig script. It worked well before the upgrade and only
> failed afterwards.
>
> Regards,
> Shawn
>
>

Re: CDH3 fail python udf

Posted by Xiaomeng Wan <sh...@gmail.com>.
Hi Aniket,

We put both jython.jar and myudf.py on the classpath and also register
jython.jar in our Pig script. It worked well before the upgrade and only
failed afterwards.

Regards,
Shawn

On Thu, Mar 31, 2011 at 4:38 PM, Aniket Mokashi <am...@andrew.cmu.edu> wrote:
> I think this might be because, when you start in Hadoop mode, your
> classpath configuration does not include jython.jar. Can you put it
> explicitly on the classpath and check?
>
> Thanks,
> Aniket
>

Re: CDH3 fail python udf

Posted by Aniket Mokashi <am...@andrew.cmu.edu>.
I think this might be because, when you start in Hadoop mode, your
classpath configuration does not include jython.jar. Can you put it
explicitly on the classpath and check?

Thanks,
Aniket
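
One quick way to act on this suggestion is to check whether a Jython jar
appears in the classpath variables on the machine that launches Pig. The
sketch here is an assumption-laden helper, not part of Pig: the function
name jython_jars is made up, and it assumes the classpath is supplied
through the PIG_CLASSPATH or HADOOP_CLASSPATH environment variables. It
also does not expand wildcard entries such as /opt/lib/*.

```python
import os


def jython_jars(classpath, sep=os.pathsep):
    """Return the classpath entries whose file name looks like a Jython jar."""
    hits = []
    for entry in classpath.split(sep):
        entry = entry.strip()
        name = os.path.basename(entry).lower()
        if name.startswith("jython") and name.endswith(".jar"):
            hits.append(entry)
    return hits


if __name__ == "__main__":
    # Assumed variable names; adjust to however the classpath is set locally.
    for var in ("PIG_CLASSPATH", "HADOOP_CLASSPATH"):
        value = os.environ.get(var, "")
        print(var, "->", jython_jars(value) or "no jython jar found")
```

This only inspects the client side. A jar that is present where Pig is
launched may still be missing on the nodes that run the map tasks.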

On Thu, March 31, 2011 6:07 pm, Xiaomeng Wan wrote:
> Hi,
> We recently upgraded our Hadoop from CDH2 to CDH3b4 and have had problems
> using some old Python UDFs. Running in local mode still works, but in
> Hadoop mode it gives errors like "could not instantiate
> 'org.apache.pig.scripting.jython.JythonFunction' with arguments...".
> Has anyone seen a similar error with Python UDFs on this Hadoop distribution?
> We are using Pig 0.8.0. Thanks!
>
>
> Regards
> Shawn
>
>
>