Posted to dev@pig.apache.org by "Alan Gates (JIRA)" <ji...@apache.org> on 2011/06/01 01:04:47 UTC
[jira] [Created] (PIG-2101) Registering a Python function in a
directory other than the current working directory fails
Registering a Python function in a directory other than the current working directory fails
-------------------------------------------------------------------------------------------
Key: PIG-2101
URL: https://issues.apache.org/jira/browse/PIG-2101
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.8.1
Reporter: Alan Gates
In MapReduce mode, if the register command references a file in a directory other than the current one, executing the Python UDF on the backend fails with:
Deserialization error: could not instantiate 'org.apache.pig.scripting.jython.JythonFunction' with arguments '[../udfs/python/production.py, production]'
I assume the backend is trying to locate the UDF using the path exactly as it was given in the register command.
The script is:
{code}
register '../udfs/python/production.py' using jython as bballudfs;
players  = load 'baseball' as (name:chararray, team:chararray,
               pos:bag{t:(p:chararray)}, bat:map[]);
nonnull  = filter players by bat#'slugging_percentage' is not null and
               bat#'on_base_percentage' is not null;
calcprod = foreach nonnull generate name, bballudfs.production(
               (float)bat#'slugging_percentage',
               (float)bat#'on_base_percentage');
dump calcprod;
{code}
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (PIG-2101) Registering a Python function in a
directory other than the current working directory fails
Posted by "Daniel Eklund (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045599#comment-13045599 ]
Daniel Eklund commented on PIG-2101:
------------------------------------
As per:
http://mail-archives.apache.org/mod_mbox/pig-user/201106.mbox/browser
Although the title of this bug implies it, to be explicit: deserialization on the backend fails for any Python file registered on the front end from anything other than a child-relative directory (a subdirectory of the working directory).
The functionality should support all directory references: absolute, parent-relative (i.e. '../..'), and child-relative.
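
To make the three cases concrete, here is a minimal Python sketch that classifies a register path and resolves it to an absolute path against the front-end working directory. This is only an illustration of the expected behaviour, not how Pig handles the path internally; the function names and the example working directory are hypothetical.

```python
import posixpath


def classify_register_path(path):
    """Return 'absolute', 'parent-relative', or 'child-relative'."""
    if posixpath.isabs(path):
        return "absolute"
    # Normalize first so that 'a/../../b' is recognised as leaving
    # the working directory.
    normalized = posixpath.normpath(path)
    if normalized == ".." or normalized.startswith("../"):
        return "parent-relative"
    return "child-relative"


def resolve_on_front_end(path, cwd):
    """Resolve a register path against the front-end working directory.

    Assumption: resolving once on the front end would yield a path
    that no longer depends on where the backend task happens to run.
    """
    return posixpath.normpath(posixpath.join(cwd, path))
```

For example, with a hypothetical working directory of '/home/alan/scripts', the path from the original script resolves to '/home/alan/udfs/python/production.py', while an absolute path is returned unchanged.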
[jira] [Commented] (PIG-2101) Registering a Python function in a
directory other than the current working directory fails
Posted by "Daniel Eklund (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13043982#comment-13043982 ]
Daniel Eklund commented on PIG-2101:
------------------------------------
I can actually still get it to work with a relative directory that does not involve '..'.
For instance:
Register 'test/simple.py' as myNamespace;
where test is a subdir in the working directory. But any path with '..' fails.
It would also be nice to add a note to the documentation about NOT using absolute paths.
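
Until this is fixed, a small pre-flight check can flag register statements likely to hit the problem. The Python sketch below is an assumption-laden illustration: it only looks at quoted register paths, the function name is hypothetical, and "risky" simply means the two cases reported as failing in this thread on 0.8.1 (absolute paths and paths containing '..').

```python
import re

# Matches the quoted path of a register statement at the start of a line,
# e.g.  register '../udfs/python/production.py' using jython as bballudfs;
REGISTER_RE = re.compile(
    r"""^\s*register\s+(['"])(?P<path>.+?)\1""",
    re.IGNORECASE | re.MULTILINE,
)


def risky_register_paths(script_text):
    """Return register paths that are absolute or contain a '..' component."""
    risky = []
    for match in REGISTER_RE.finditer(script_text):
        path = match.group("path")
        if path.startswith("/") or ".." in path.split("/"):
            risky.append(path)
    return risky
```

Running it over a script that contains both register statements quoted in this issue would report only the '../udfs/python/production.py' path, since 'test/simple.py' stays inside the working directory.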