Posted to dev@pig.apache.org by "Viraj Bhat (JIRA)" <ji...@apache.org> on 2012/06/06 03:28:24 UTC

[jira] [Created] (PIG-2741) Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created

Viraj Bhat created PIG-2741:
-------------------------------

             Summary: Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created
                 Key: PIG-2741
                 URL: https://issues.apache.org/jira/browse/PIG-2741
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.10.0
         Environment: Pig 0.10
            Reporter: Viraj Bhat


I have a Python script which writes out data to HDFS
{code}
from org.apache.hadoop.conf import *
from org.apache.hadoop.fs import *

config = Configuration()
hdfs = FileSystem.get(config)
out = hdfs.create(Path("/user/viraj/junk.txt"))
out.write("Hello World!")
{code}

When I run this I get the following error:
{quote}
2012-06-06 01:20:43,101 [main] INFO  org.apache.pig.Main - Logging error messages to: /home/viraj/pig_1338945643097.log
2012-06-06 01:20:43,502 [main] INFO  org.apache.pig.Main - Run embedded script: jython
2012-06-06 01:20:43,603 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://namenode:8020
2012-06-06 01:20:44,069 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: jobtracker:50300
*sys-package-mgr*: can't create package cache dir, '/mydir/xx'
2012-06-06 01:20:45,815 [main] INFO  org.apache.pig.scripting.jython.JythonScriptEngine - created tmp python.cachedir=/tmp/pig_jython_7126458276821733512
2012-06-06 01:20:45,904 [main] ERROR org.apache.pig.Main - ERROR 1121: Python Error. Traceback (most recent call last):
  File "/homes/viraj/test.py", line 4, in <module>
    config = Configuration()
NameError: name 'Configuration' is not defined

{quote}

I tried to solve it in two ways:

1) Overriding pig.properties to specify python.cachedir.skip=false, but this does not seem to work.

2) Specifying -Dpython.cachedir=/mydirectory/tmp on the command line. This is the only workaround that worked.

Viraj
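
The command-line workaround above can be checked before launching Pig. The following is a minimal sketch in plain Python (not Jython-specific), assuming a list of candidate directories supplied by the caller. The helper names `first_writable_dir` and `cachedir_flag` are hypothetical; only the `-Dpython.cachedir=` flag comes from the report.

```python
import os
import tempfile


def first_writable_dir(candidates):
    """Return the first candidate that is an existing, writable directory.

    Hypothetical helper: it only inspects the local file system and knows
    nothing about Pig or Jython.
    """
    for path in candidates:
        if os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK):
            return path
    return None


def cachedir_flag(candidates):
    """Build the -Dpython.cachedir=... argument used in the workaround."""
    path = first_writable_dir(candidates)
    if path is None:
        raise RuntimeError("no writable cache directory among candidates")
    return "-Dpython.cachedir=" + path


if __name__ == "__main__":
    # /mydir/xx is the directory from the log that could not be created;
    # the system temp dir stands in for a location the user can write to.
    print(cachedir_flag(["/mydir/xx", tempfile.gettempdir()]))
```

The printed flag would then be passed on the Pig command line as in workaround 2.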

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (PIG-2741) Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created

Posted by "Koji Noguchi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Noguchi updated PIG-2741:
------------------------------

    Attachment: pig-2741-testfailing-pig2665-v5.patch.txt

Thanks, Daniel, for adding the e2e test case.
Added one line to the test case so that it now fails without this patch.
{noformat}
#jython uses 'python.home'/cachedir when python.cachedir is not specified.
#To test python.cachedir is set correctly by the framework,
#setting python.home to a random path
                        'java_params' => ['-Dpython.home=/dev/null/fake'],
{noformat}

Confirmed that this test case:
i) Fails without the patch (because /dev/null/fake is used as the cache dir)
ii) Succeeds with the patch (because the cache dir set by the framework is used)
iii) Fails with the current PIG-2665 patch because 'python.cachedir.skip' is set to true in standalone mode.
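
The fallback that the test relies on can be modelled in a few lines. This is a simplified sketch in plain Python, not Jython's actual code: the function name `resolve_cachedir` is hypothetical, and the default directory name "cachedir" is an assumption based on the paths mentioned elsewhere in this thread.

```python
import posixpath

# Assumed default directory name; taken from the "cachedir" paths in this thread.
DEFAULT_CACHEDIR_NAME = "cachedir"


def resolve_cachedir(props):
    """Model of the lookup: an explicit python.cachedir wins, otherwise
    the cache dir falls back to <python.home>/cachedir."""
    explicit = props.get("python.cachedir")
    if explicit is not None:
        return explicit
    home = props.get("python.home", ".")
    return posixpath.join(home, DEFAULT_CACHEDIR_NAME)


if __name__ == "__main__":
    # Without python.cachedir, the bogus python.home from the test is used.
    print(resolve_cachedir({"python.home": "/dev/null/fake"}))
    # With python.cachedir set by the framework, python.home no longer matters.
    print(resolve_cachedir({"python.home": "/dev/null/fake",
                            "python.cachedir": "/tmp/pig_jython_example"}))
```

Under this model, pointing python.home at an unusable path makes the test fail unless the framework sets python.cachedir in time, which is what the added line is meant to verify.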

                
> Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2741
>                 URL: https://issues.apache.org/jira/browse/PIG-2741
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.10.0
>         Environment: Pig 0.10
>            Reporter: Viraj Bhat
>         Attachments: pig-2741-no-test-yet-v1.patch.txt, pig-2741-testfailing-pig2665-v2.patch.txt, pig-2741-testfailing-pig2665-v3.patch.txt, pig-2741-testfailing-pig2665-v4.patch.txt, pig-2741-testfailing-pig2665-v5.patch.txt, pig-2741-testfailing-pig2665-v5.patch.txt
>
>


        

[jira] [Commented] (PIG-2741) Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created

Posted by "Rohini Palaniswamy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290653#comment-13290653 ] 

Rohini Palaniswamy commented on PIG-2741:
-----------------------------------------

> The problem is that the user doesn't have privileges on the default python.cachedir; the workaround is to specify a python.cachedir on which the user does have privileges.

PythonInterpreter.initialize calls PMPySystemState.initCacheDirectory, which does cachedir = new File(props.getProperty(PYTHON_CACHEDIR, CACHEDIR_DEFAULT_NAME)); This tries to create the cachedir in the same directory as the Pig installation because PYTHON_CACHEDIR is not set by then. For it to work, we would need a cachedir directory in the same directory as pig.jar with 777 permissions, which is not desirable.
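
The ordering problem described here can be illustrated with a small model. This is a sketch in plain Python under stated assumptions, not the Pig or Jython implementation: `initialize`, `early_init`, and `late_init` are hypothetical names, a dict stands in for the Java properties, and "cachedir" stands in for CACHEDIR_DEFAULT_NAME.

```python
def initialize(props):
    """Model of a one-time initialization that snapshots the cache dir.

    Mirrors the shape of props.getProperty(PYTHON_CACHEDIR, default):
    whatever is in props at call time is what gets used.
    """
    return props.get("python.cachedir", "cachedir")


def early_init():
    """Initialize first, set the property afterwards (the failing order)."""
    props = {}
    snapshot = initialize(props)
    props["python.cachedir"] = "/tmp/pig_jython_example"  # too late
    return snapshot


def late_init():
    """Set the property first, then initialize (the intended order)."""
    props = {"python.cachedir": "/tmp/pig_jython_example"}
    return initialize(props)


if __name__ == "__main__":
    print("early:", early_init())
    print("late: ", late_init())
```

In the early case the default name is used even though the property is set moments later, which matches the log in the report: the "can't create package cache dir" message appears before "created tmp python.cachedir=...".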
                
> Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2741
>                 URL: https://issues.apache.org/jira/browse/PIG-2741
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.10.0
>         Environment: Pig 0.10
>            Reporter: Viraj Bhat
>         Attachments: pig-2741-no-test-yet-v1.patch.txt
>
>


        

[jira] [Updated] (PIG-2741) Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-2741:
----------------------------

    Attachment: pig-2741-testfailing-pig2665-v5.patch.txt

That was an accident. Reattached the patch. Thanks.
                
> Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2741
>                 URL: https://issues.apache.org/jira/browse/PIG-2741
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.10.0
>         Environment: Pig 0.10
>            Reporter: Viraj Bhat
>         Attachments: pig-2741-no-test-yet-v1.patch.txt, pig-2741-testfailing-pig2665-v2.patch.txt, pig-2741-testfailing-pig2665-v3.patch.txt, pig-2741-testfailing-pig2665-v4.patch.txt, pig-2741-testfailing-pig2665-v5.patch.txt
>
>


        

[jira] [Commented] (PIG-2741) Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created

Posted by "Koji Noguchi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290798#comment-13290798 ] 

Koji Noguchi commented on PIG-2741:
-----------------------------------

bq. But the weird thing is the script fails with the same error message if I apply the patch PIG-2665.
This JIRA was failing because python.cachedir was set to an incorrect path at initialization.
For PIG-2665 with jython-standalone-2.5.2.jar, it seems to be failing because 'python.cachedir.skip' is somehow set to true by default.
The error message is the same but the cause is different.

bq. To make sure a later patch does not break this script, please add a test case.
Adding a test case for the PIG-2665 failure is probably easy. As for this JIRA, I don't know of a good way. The owner of jython-2.5.0.jar (dir) and the user running the test need to be different for this issue to happen. When I tested manually, I just ran mkdir ./build/ivy/lib/Pig/cachedir followed by chmod 000 ./build/ivy/lib/Pig/cachedir.
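
The manual setup (mkdir followed by chmod 000) could be scripted roughly as follows. This is a hedged sketch in plain Python for a POSIX system: the helper names `make_locked_cachedir` and `unlock` are hypothetical, and a temporary directory is used instead of ./build/ivy/lib/Pig so that nothing in a real checkout is touched.

```python
import os
import stat
import tempfile


def make_locked_cachedir(parent):
    """Create <parent>/cachedir and strip all permission bits (chmod 000)."""
    path = os.path.join(parent, "cachedir")
    os.mkdir(path)
    os.chmod(path, 0)
    return path


def unlock(path):
    """Restore owner permissions so the directory can be cleaned up."""
    os.chmod(path, stat.S_IRWXU)


if __name__ == "__main__":
    parent = tempfile.mkdtemp()
    locked = make_locked_cachedir(parent)
    try:
        # For an ordinary user this is expected to report False; a
        # privileged user such as root typically bypasses permission bits.
        print("writable:", os.access(locked, os.W_OK))
    finally:
        unlock(locked)
        os.rmdir(locked)
        os.rmdir(parent)
```

Because privileged users usually ignore the permission bits, a test built this way would not reproduce the failure when run as root, which is one more reason an automated test for this case is awkward.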
                
> Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2741
>                 URL: https://issues.apache.org/jira/browse/PIG-2741
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.10.0
>         Environment: Pig 0.10
>            Reporter: Viraj Bhat
>         Attachments: pig-2741-no-test-yet-v1.patch.txt
>
>


        

[jira] [Commented] (PIG-2741) Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290612#comment-13290612 ] 

Daniel Dai commented on PIG-2741:
---------------------------------

The patch looks good to me. But the weird thing is the script fails with the same error message if I apply the patch PIG-2665.

To make sure a later patch does not break this script, please add a test case. I will take a look at PIG-2665.
                
> Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2741
>                 URL: https://issues.apache.org/jira/browse/PIG-2741
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.10.0
>         Environment: Pig 0.10
>            Reporter: Viraj Bhat
>         Attachments: pig-2741-no-test-yet-v1.patch.txt
>
>


        

[jira] [Commented] (PIG-2741) Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290897#comment-13290897 ] 

Daniel Dai commented on PIG-2741:
---------------------------------

both 0.10 and trunk.
                
> Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2741
>                 URL: https://issues.apache.org/jira/browse/PIG-2741
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.10.0
>         Environment: Pig 0.10
>            Reporter: Viraj Bhat
>         Attachments: pig-2741-no-test-yet-v1.patch.txt, pig-2741-testfailing-pig2665-v2.patch.txt, pig-2741-testfailing-pig2665-v3.patch.txt, pig-2741-testfailing-pig2665-v4.patch.txt
>
>


        

[jira] [Commented] (PIG-2741) Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289959#comment-13289959 ] 

Daniel Dai commented on PIG-2741:
---------------------------------

The problem is that the user doesn't have privileges on the default python.cachedir; the workaround is to specify a python.cachedir on which the user does have privileges.
                
> Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2741
>                 URL: https://issues.apache.org/jira/browse/PIG-2741
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.10.0
>         Environment: Pig 0.10
>            Reporter: Viraj Bhat
>


        

[jira] [Commented] (PIG-2741) Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290891#comment-13290891 ] 

Daniel Dai commented on PIG-2741:
---------------------------------

The test case is only applicable to Pig trunk. If we plan to commit to the 0.10 branch, we need to make a test case for both 0.10 and trunk.
                
> Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2741
>                 URL: https://issues.apache.org/jira/browse/PIG-2741
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.10.0
>         Environment: Pig 0.10
>            Reporter: Viraj Bhat
>         Attachments: pig-2741-no-test-yet-v1.patch.txt, pig-2741-testfailing-pig2665-v2.patch.txt, pig-2741-testfailing-pig2665-v3.patch.txt
>
>


        

[jira] [Updated] (PIG-2741) Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created

Posted by "Koji Noguchi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Noguchi updated PIG-2741:
------------------------------

    Attachment: pig-2741-no-test-yet-v1.patch.txt

Took out the PythonInterpreter.initialize call and delayed the argument setting until later. Would this work?
                
> Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2741
>                 URL: https://issues.apache.org/jira/browse/PIG-2741
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.10.0
>         Environment: Pig 0.10
>            Reporter: Viraj Bhat
>         Attachments: pig-2741-no-test-yet-v1.patch.txt
>
>


        

[jira] [Commented] (PIG-2741) Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created

Posted by "Rohini Palaniswamy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291486#comment-13291486 ] 

Rohini Palaniswamy commented on PIG-2741:
-----------------------------------------

Daniel,
   You seem to have put a hardcoded path in the test by mistake.
 +sys.path.append("/Users/daijy/hadoop-1.0.0/hadoop-core-1.0.0.jar")
                
> Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2741
>                 URL: https://issues.apache.org/jira/browse/PIG-2741
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.10.0
>         Environment: Pig 0.10
>            Reporter: Viraj Bhat
>         Attachments: pig-2741-no-test-yet-v1.patch.txt, pig-2741-testfailing-pig2665-v2.patch.txt, pig-2741-testfailing-pig2665-v3.patch.txt, pig-2741-testfailing-pig2665-v4.patch.txt
>
>

        

[jira] [Commented] (PIG-2741) Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created

Posted by "Rohini Palaniswamy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290659#comment-13290659 ] 

Rohini Palaniswamy commented on PIG-2741:
-----------------------------------------

Also, because the default cache directory is always pig_lib_dir/cachedir/packages, it can cause problems when multiple users run scripts. Koji's fix takes us back to the old code path in JythonScriptEngine, which generated a random directory and set it as the Python cache directory.
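
As a rough illustration of the per-run random directory idea, the following plain-Python sketch creates a uniquely named cache directory on each call. The function name make_private_cachedir is hypothetical, and the pig_jython_ prefix is only borrowed from the log line in the issue; this is not the JythonScriptEngine code.

```python
import os
import tempfile


def make_private_cachedir(prefix="pig_jython_"):
    """Create and return a uniquely named directory for one run.

    tempfile.mkdtemp picks an unused name and creates the directory so
    that only the creating user can read, write, or search it, which
    avoids two users sharing (and fighting over) a single cache path.
    """
    return tempfile.mkdtemp(prefix=prefix)
```

Each call returns a different path under the system temp directory, so concurrent users never collide on one shared location. Cleaning the directory up afterwards is left to the caller.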
                
> Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2741
>                 URL: https://issues.apache.org/jira/browse/PIG-2741
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.10.0
>         Environment: Pig 0.10
>            Reporter: Viraj Bhat
>         Attachments: pig-2741-no-test-yet-v1.patch.txt
>
>

        

[jira] [Commented] (PIG-2741) Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created

Posted by "Koji Noguchi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290347#comment-13290347 ] 

Koji Noguchi commented on PIG-2741:
-----------------------------------

Note that the "can't create" error appears before JythonScriptEngine creates the python.cachedir directory.

bq. sys-package-mgr: can't create package cache dir, '/mydir/xx'
bq. 2012-06-06 01:20:45,815 [main] INFO org.apache.pig.scripting.jython.JythonScriptEngine - created tmp python.cachedir=/tmp/pig_jython_7126458276821733512

This seems to be a regression from 
   PIG-2548: Support for providing parameters to python script

bq. src/org/apache/pig/scripting/jython/JythonScriptEngine.java
bq. 359                 PythonInterpreter.initialize(null, null, argv);

such that this initialization is called before JythonScriptEngine updates python.cachedir, which leads to this error.
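
The ordering problem can be illustrated with a small toy model in plain Python. This is only a sketch: ToyInterpreter, its initialize method, the props dictionary, and both paths are hypothetical stand-ins, not Jython's or Pig's actual code. The assumption being modelled is that the cache directory setting is read once, at initialization time, so a later update has no effect.

```python
# Toy model (hypothetical names): a component that reads its cache
# directory setting exactly once, when initialize() is first called.
class ToyInterpreter:
    cachedir = None
    initialized = False

    @classmethod
    def initialize(cls, props):
        # Later calls are ignored, mimicking one-time initialization.
        if cls.initialized:
            return
        cls.cachedir = props.get("python.cachedir", "/default/cachedir")
        cls.initialized = True

    @classmethod
    def reset(cls):
        cls.cachedir = None
        cls.initialized = False


def broken_order(props):
    """Initialize first, then set the property: the update is too late."""
    ToyInterpreter.reset()
    ToyInterpreter.initialize(props)
    props["python.cachedir"] = "/tmp/pig_jython_example"
    return ToyInterpreter.cachedir


def fixed_order(props):
    """Set the property first, then initialize: the new value is used."""
    ToyInterpreter.reset()
    props["python.cachedir"] = "/tmp/pig_jython_example"
    ToyInterpreter.initialize(props)
    return ToyInterpreter.cachedir
```

In this model, broken_order({}) still reports the default directory, while fixed_order({}) reports the temporary one.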

                
> Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2741
>                 URL: https://issues.apache.org/jira/browse/PIG-2741
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.10.0
>         Environment: Pig 0.10
>            Reporter: Viraj Bhat
>

        

[jira] [Updated] (PIG-2741) Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-2741:
----------------------------

    Attachment: pig-2741-testfailing-pig2665-v4.patch.txt

Add a different test case which is applicable to 
                
> Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2741
>                 URL: https://issues.apache.org/jira/browse/PIG-2741
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.10.0
>         Environment: Pig 0.10
>            Reporter: Viraj Bhat
>         Attachments: pig-2741-no-test-yet-v1.patch.txt, pig-2741-testfailing-pig2665-v2.patch.txt, pig-2741-testfailing-pig2665-v3.patch.txt, pig-2741-testfailing-pig2665-v4.patch.txt
>
>

        

[jira] [Resolved] (PIG-2741) Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created

Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai resolved PIG-2741.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.10.1
                   0.11
         Assignee: Koji Noguchi
     Hadoop Flags: Reviewed

Patch committed to both 0.10 branch and trunk. Thanks Koji!
                
> Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2741
>                 URL: https://issues.apache.org/jira/browse/PIG-2741
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.10.0
>         Environment: Pig 0.10
>            Reporter: Viraj Bhat
>            Assignee: Koji Noguchi
>             Fix For: 0.11, 0.10.1
>
>         Attachments: pig-2741-no-test-yet-v1.patch.txt, pig-2741-testfailing-pig2665-v2.patch.txt, pig-2741-testfailing-pig2665-v3.patch.txt, pig-2741-testfailing-pig2665-v4.patch.txt, pig-2741-testfailing-pig2665-v5.patch.txt, pig-2741-testfailing-pig2665-v5.patch.txt, pig-2741-testfailing-pig2665-v6.patch.txt
>
>

        

[jira] [Updated] (PIG-2741) Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created

Posted by "Koji Noguchi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Noguchi updated PIG-2741:
------------------------------

    Attachment: pig-2741-testfailing-pig2665-v6.patch.txt

Hopefully this is the last one. Took out the 

bq. +sys.path.append("/Users/daijy/hadoop-1.0.0/hadoop-core-1.0.0.jar")
                
> Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2741
>                 URL: https://issues.apache.org/jira/browse/PIG-2741
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.10.0
>         Environment: Pig 0.10
>            Reporter: Viraj Bhat
>         Attachments: pig-2741-no-test-yet-v1.patch.txt, pig-2741-testfailing-pig2665-v2.patch.txt, pig-2741-testfailing-pig2665-v3.patch.txt, pig-2741-testfailing-pig2665-v4.patch.txt, pig-2741-testfailing-pig2665-v5.patch.txt, pig-2741-testfailing-pig2665-v5.patch.txt, pig-2741-testfailing-pig2665-v6.patch.txt
>
>

        

[jira] [Updated] (PIG-2741) Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created

Posted by "Koji Noguchi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Noguchi updated PIG-2741:
------------------------------

    Attachment: pig-2741-testfailing-pig2665-v3.patch.txt

Daniel asked me to add a brief comment on what the test is doing.  Updated.

Also, I forgot to mention that during testing I found that PIG_CMD_ARGS_REMAINDERS could be null when called from the unit test.  Added an extra check with a warning to avoid the NPE.
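
A minimal plain-Python sketch of that kind of guard follows. The names script_argv and remainders are hypothetical, and the thread does not say what the patch does once the value is found to be null; falling back to an empty argument list is an assumption made here for illustration.

```python
import logging

log = logging.getLogger("example")


def script_argv(script_path, remainders):
    """Build the argv list for an embedded script.

    remainders may be None (for example when no command line was
    parsed, as in a unit test). Instead of failing, log a warning and
    fall back to an empty list (assumed behaviour, see lead-in).
    """
    if remainders is None:
        log.warning("no remaining command-line arguments found; "
                    "passing the script path only")
        remainders = []
    return [script_path] + list(remainders)
```

With this guard, script_argv("test.py", None) yields just the script path rather than raising.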
                
> Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2741
>                 URL: https://issues.apache.org/jira/browse/PIG-2741
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.10.0
>         Environment: Pig 0.10
>            Reporter: Viraj Bhat
>         Attachments: pig-2741-no-test-yet-v1.patch.txt, pig-2741-testfailing-pig2665-v2.patch.txt, pig-2741-testfailing-pig2665-v3.patch.txt
>
>

        

[jira] [Updated] (PIG-2741) Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created

Posted by "Koji Noguchi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Noguchi updated PIG-2741:
------------------------------

    Attachment: pig-2741-testfailing-pig2665-v2.patch.txt

bq. Adding testcase for PIG-2665 failing is probably easy. As for this jira, 

Added a test case that would fail when run with the PIG-2665 patch. 

For testing this jira itself, I manually tested it by 

$ chmod 000 ./build/ivy/lib/Pig/cachedir

and confirmed that it fails with "NameError: name 'Configuration' is not defined" before the patch
and succeeds after the patch.
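
The same kind of manual check can be sketched in plain Python. The helper names below are hypothetical and the code is independent of Pig: it removes all permissions from a directory, probes whether a subdirectory can still be created there, and then restores the original mode. One caveat: when run as root the probe usually still succeeds, because root bypasses permission bits.

```python
import os
import stat
import tempfile


def can_create_subdir(parent):
    """Return True if a new directory can be created under parent."""
    try:
        probe = tempfile.mkdtemp(dir=parent)
    except OSError:
        return False
    os.rmdir(probe)
    return True


def check_with_permissions_removed(parent):
    """Temporarily chmod parent to 000, probe it, then restore the mode."""
    original = stat.S_IMODE(os.stat(parent).st_mode)
    os.chmod(parent, 0)
    try:
        return can_create_subdir(parent)
    finally:
        # Always restore the mode so the directory stays usable.
        os.chmod(parent, original)
```

For an ordinary user, check_with_permissions_removed is expected to return False on a directory that was writable before, which corresponds to the "can't create package cache dir" condition in the log.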



                
> Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created
> -------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2741
>                 URL: https://issues.apache.org/jira/browse/PIG-2741
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.10.0
>         Environment: Pig 0.10
>            Reporter: Viraj Bhat
>         Attachments: pig-2741-no-test-yet-v1.patch.txt, pig-2741-testfailing-pig2665-v2.patch.txt
>
>