Posted to dev@pig.apache.org by "Daniel Dai (JIRA)" <ji...@apache.org> on 2012/06/05 01:07:23 UTC

[jira] [Created] (PIG-2739) PyList should map to Bag automatically in Jython

Daniel Dai created PIG-2739:
-------------------------------

             Summary: PyList should map to Bag automatically in Jython
                 Key: PIG-2739
                 URL: https://issues.apache.org/jira/browse/PIG-2739
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.10.0, 0.11
            Reporter: Daniel Dai
            Assignee: Daniel Dai


The following script does not work:
<code>
register 'util.py' using jython as util;
A = load '1.txt' as (sentence:chararray);
B = foreach A generate flatten(util.tokenize(sentence));
dump B;
<code>

util.py
<code>
@outputSchema("words:{(word:chararray)}")
def tokenize(sentence):
    return sentence.split(' ')
<code>

Error message:
org.apache.pig.backend.executionengine.ExecException: ERROR 2078: Caught error from UDF: org.apache.pig.scripting.jython.JythonFunction [Error executing function]
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:288)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:304)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:332)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:353)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:294)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:273)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:268)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: java.io.IOException: Error executing function
	at org.apache.pig.scripting.jython.JythonFunction.exec(JythonFunction.java:122)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:262)
	... 11 more
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Cannot convert jython type (org.python.core.PyList) to pig datatype java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.pig.data.Tuple
	at org.apache.pig.scripting.jython.JythonUtils.pythonToPig(JythonUtils.java:113)
	at org.apache.pig.scripting.jython.JythonFunction.exec(JythonFunction.java:117)
	... 12 more
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.pig.data.Tuple
	at org.apache.pig.scripting.jython.JythonUtils.pythonToPig(JythonUtils.java:69)
	... 13 more

The problem is that Pig expects every element of the returned list to be a tuple, which is unintuitive in Python: sentence.split(' ') returns a plain list of strings.
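As a workaround sketch on affected versions, the UDF can wrap each word in a one-field tuple itself so that the returned list already looks like a bag of tuples. This is an illustration, not part of the attached patch. The try/except stand-in for outputSchema is an assumption added only so the file also runs outside Pig, which normally supplies the decorator when the script is registered.

```python
# outputSchema is normally provided by Pig when util.py is registered.
# The fallback below is a hypothetical stand-in for running outside Pig.
try:
    outputSchema
except NameError:
    def outputSchema(schema):
        def wrap(func):
            func.outputSchema = schema
            return func
        return wrap


@outputSchema("words:{(word:chararray)}")
def tokenize(sentence):
    # Wrap each word in a one-field tuple so every list element is a tuple,
    # which is what the conversion to a Pig bag expects.
    return [(word,) for word in sentence.split(' ')]
```

With this version, tokenize("hello pig") returns a list of one-field tuples rather than a list of bare strings.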

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (PIG-2739) PyList should map to Bag automatically in Jython

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-2739:
----------------------------

    Attachment: PIG-2739-0.patch
    

        

[jira] [Updated] (PIG-2739) PyList should map to Bag automatically in Jython

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-2739:
----------------------------

    Attachment: PIG-2739-1.patch

Added a test case and changed the comment as Julien pointed out.
                

        

[jira] [Resolved] (PIG-2739) PyList should map to Bag automatically in Jython

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-2739.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.10.1
                   0.11
     Hadoop Flags: Reviewed

Patch committed to both the 0.10 branch and trunk.
                

        

[jira] [Commented] (PIG-2739) PyList should map to Bag automatically in Jython

Posted by "Julien Le Dem (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289793#comment-13289793 ] 

Julien Le Dem commented on PIG-2739:
------------------------------------

This looks good to me.
Don't forget to update the comment right above the changed code:
{noformat}
// In jython, list need not be a bag of tuples, as it is in case of pig
// So we fail with cast exception if we dont find tuples inside bag
// This is consistent with java udf (bag should be filled with tuples)
{noformat}
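
The comment quoted above describes the behavior before the patch: a list element that is not a tuple causes a cast exception. The patch itself is not reproduced in this thread, so the following is only a Python sketch of the mapping the issue title asks for, assuming that non-tuple elements are wrapped in single-field tuples. The name list_to_bag is hypothetical and does not correspond to any function in Pig.

```python
def list_to_bag(values):
    """Return a list of tuples, wrapping each non-tuple element in a 1-tuple.

    Hypothetical illustration of mapping a Python list to a bag-like
    structure; this is not the code in JythonUtils.pythonToPig.
    """
    bag = []
    for value in values:
        if isinstance(value, tuple):
            # Already a tuple: keep it unchanged.
            bag.append(value)
        else:
            # Bare value (for example a string): wrap it in a one-field tuple.
            bag.append((value,))
    return bag


words = list_to_bag("a b c".split(' '))
```

Under this assumption, a UDF that returns sentence.split(' ') directly would no longer need to build the tuples by hand.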
                

        

[jira] [Updated] (PIG-2739) PyList should map to Bag automatically in Jython

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-2739:
----------------------------

    Description: 
The following script does not work:
{code}
register 'util.py' using jython as util;
A = load '1.txt' as (sentence:chararray);
B = foreach A generate flatten(util.tokenize(sentence));
dump B;
{code}

util.py
{code}
@outputSchema("words:{(word:chararray)}")
def tokenize(sentence):
    return sentence.split(' ')
{code}

Error message:
org.apache.pig.backend.executionengine.ExecException: ERROR 2078: Caught error from UDF: org.apache.pig.scripting.jython.JythonFunction [Error executing function]
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:288)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:304)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:332)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:353)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:294)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:273)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:268)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: java.io.IOException: Error executing function
	at org.apache.pig.scripting.jython.JythonFunction.exec(JythonFunction.java:122)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:262)
	... 11 more
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Cannot convert jython type (org.python.core.PyList) to pig datatype java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.pig.data.Tuple
	at org.apache.pig.scripting.jython.JythonUtils.pythonToPig(JythonUtils.java:113)
	at org.apache.pig.scripting.jython.JythonFunction.exec(JythonFunction.java:117)
	... 12 more
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.pig.data.Tuple
	at org.apache.pig.scripting.jython.JythonUtils.pythonToPig(JythonUtils.java:69)
	... 13 more

The problem is that Pig expects every element of the returned list to be a tuple, which is unintuitive in Python: sentence.split(' ') returns a plain list of strings.

  was:
The following script does not work:
<code>
register 'util.py' using jython as util;
A = load '1.txt' as (sentence:chararray);
B = foreach A generate flatten(util.tokenize(sentence));
dump B;
<code>

util.py
<code>
@outputSchema("words:{(word:chararray)}")
def tokenize(sentence):
    return sentence.split(' ')
<code>

Error message:
org.apache.pig.backend.executionengine.ExecException: ERROR 2078: Caught error from UDF: org.apache.pig.scripting.jython.JythonFunction [Error executing function]
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:288)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:304)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:332)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:353)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:294)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:273)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:268)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: java.io.IOException: Error executing function
	at org.apache.pig.scripting.jython.JythonFunction.exec(JythonFunction.java:122)
	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:262)
	... 11 more
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Cannot convert jython type (org.python.core.PyList) to pig datatype java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.pig.data.Tuple
	at org.apache.pig.scripting.jython.JythonUtils.pythonToPig(JythonUtils.java:113)
	at org.apache.pig.scripting.jython.JythonFunction.exec(JythonFunction.java:117)
	... 12 more
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.pig.data.Tuple
	at org.apache.pig.scripting.jython.JythonUtils.pythonToPig(JythonUtils.java:69)
	... 13 more

The problem is that Pig expects every element of the returned list to be a tuple, which is unintuitive in Python: sentence.split(' ') returns a plain list of strings.

    