Posted to dev@pig.apache.org by "Viraj Bhat (JIRA)" <ji...@apache.org> on 2012/06/06 03:28:24 UTC
[jira] [Created] (PIG-2741) Python script throws an NameError: name
'Configuration' is not defined in case cache dir is not created
Viraj Bhat created PIG-2741:
-------------------------------
Summary: Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created
Key: PIG-2741
URL: https://issues.apache.org/jira/browse/PIG-2741
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.10.0
Environment: Pig 0.10
Reporter: Viraj Bhat
I have a Python script which writes out data to HDFS:
{code}
from org.apache.hadoop.conf import *
from org.apache.hadoop.fs import *
config = Configuration()
hdfs = FileSystem.get(config)
out = hdfs.create(Path("/user/viraj/junk.txt"))
out.write("Hello World!")
{code}
When I run this I get the following error:
{quote}
2012-06-06 01:20:43,101 [main] INFO org.apache.pig.Main - Logging error messages to: /home/viraj/pig_1338945643097.log
2012-06-06 01:20:43,502 [main] INFO org.apache.pig.Main - Run embedded script: jython
2012-06-06 01:20:43,603 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://namenode:8020
2012-06-06 01:20:44,069 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: jobtracker:50300
*sys-package-mgr*: can't create package cache dir, '/mydir/xx'
2012-06-06 01:20:45,815 [main] INFO org.apache.pig.scripting.jython.JythonScriptEngine - created tmp python.cachedir=/tmp/pig_jython_7126458276821733512
2012-06-06 01:20:45,904 [main] ERROR org.apache.pig.Main - ERROR 1121: Python Error. Traceback (most recent call last):
File "/homes/viraj/test.py", line 4, in <module>
config = Configuration()
NameError: name 'Configuration' is not defined
{quote}
I tried to solve it in two ways:
1) Overriding pig.properties to specify python.cachedir.skip=false, but that does not seem to work.
2) Specifying -Dpython.cachedir=/mydirectory/tmp on the command line. This is the only workaround that works.
Viraj
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PIG-2741) Python script throws an NameError: name
'Configuration' is not defined in case cache dir is not created
Posted by "Koji Noguchi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Koji Noguchi updated PIG-2741:
------------------------------
Attachment: pig-2741-testfailing-pig2665-v5.patch.txt
Thanks, Daniel, for adding the e2e test case.
I added one line to the test case so that it now fails without this patch.
{noformat}
#jython uses 'python.home'/cachedir when python.cachedir is not specified.
#To test python.cachedir is set correctly by the framework,
#setting python.home to a random path
'java_params' => ['-Dpython.home=/dev/null/fake'],
{noformat}
Confirmed that this test case:
i) Fails without the patch (due to using /dev/null/fake as the cache dir).
ii) Succeeds with the patch (by using the cache dir set by the framework).
iii) Fails with the current PIG-2665 patch, because 'python.cachedir.skip' is set to true in standalone mode.
> Attachments: pig-2741-no-test-yet-v1.patch.txt, pig-2741-testfailing-pig2665-v2.patch.txt, pig-2741-testfailing-pig2665-v3.patch.txt, pig-2741-testfailing-pig2665-v4.patch.txt, pig-2741-testfailing-pig2665-v5.patch.txt, pig-2741-testfailing-pig2665-v5.patch.txt
[jira] [Commented] (PIG-2741) Python script throws an NameError:
name 'Configuration' is not defined in case cache dir is not created
Posted by "Rohini Palaniswamy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290653#comment-13290653 ]
Rohini Palaniswamy commented on PIG-2741:
-----------------------------------------
> The problem is that the user doesn't have write permission on the default python.cachedir; the workaround is to specify a python.cachedir the user can write to.
PythonInterpreter.initialize calls PMPySystemState.initCacheDirectory, which does
cachedir = new File(props.getProperty(PYTHON_CACHEDIR, CACHEDIR_DEFAULT_NAME));
Because PYTHON_CACHEDIR is not set by then, this tries to create the cachedir in the same directory as the Pig installation. For that to work, we would need a cachedir directory with 777 permissions in the same directory as pig.jar, which is not desirable.
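The lookup described in this comment can be modeled in a few lines of plain Python. This is only a sketch of the behaviour as described in this thread, not Jython's or Pig's actual code. The function name resolve_cachedir, the dict standing in for the Java properties, and the /opt/pig/lib install path are made up for illustration. The default name "cachedir" and the skip property are the ones mentioned elsewhere in this thread.

```python
import os

# Property names as quoted in this thread; "cachedir" is the default
# directory name mentioned in the comments here.
PYTHON_CACHEDIR = "python.cachedir"
PYTHON_CACHEDIR_SKIP = "python.cachedir.skip"
CACHEDIR_DEFAULT_NAME = "cachedir"


def resolve_cachedir(props, install_dir):
    """Model which package cache directory would be picked.

    props       -- dict standing in for the Java properties (hypothetical)
    install_dir -- directory a relative name is resolved against; assumed
                   to be the directory holding the Pig/Jython jar
    Returns None when caching is skipped.
    """
    if props.get(PYTHON_CACHEDIR_SKIP, "false").lower() == "true":
        return None
    name = props.get(PYTHON_CACHEDIR, CACHEDIR_DEFAULT_NAME)
    if os.path.isabs(name):
        return name
    return os.path.join(install_dir, name)


# Property not set when the interpreter initializes: the default lands
# next to the jar, where an ordinary user may not be able to write.
print(resolve_cachedir({}, "/opt/pig/lib"))

# Property set up front (the -Dpython.cachedir workaround): that path wins.
print(resolve_cachedir({PYTHON_CACHEDIR: "/mydirectory/tmp"}, "/opt/pig/lib"))
```

The point of the sketch is the ordering problem: whatever value the property has at the moment of initialization decides the directory, so setting it afterwards has no effect on the first lookup.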
> Attachments: pig-2741-no-test-yet-v1.patch.txt
[jira] [Updated] (PIG-2741) Python script throws an NameError: name
'Configuration' is not defined in case cache dir is not created
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai updated PIG-2741:
----------------------------
Attachment: pig-2741-testfailing-pig2665-v5.patch.txt
That was an accident. Reattached the patch. Thanks.
> Attachments: pig-2741-no-test-yet-v1.patch.txt, pig-2741-testfailing-pig2665-v2.patch.txt, pig-2741-testfailing-pig2665-v3.patch.txt, pig-2741-testfailing-pig2665-v4.patch.txt, pig-2741-testfailing-pig2665-v5.patch.txt
[jira] [Commented] (PIG-2741) Python script throws an NameError:
name 'Configuration' is not defined in case cache dir is not created
Posted by "Koji Noguchi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290798#comment-13290798 ]
Koji Noguchi commented on PIG-2741:
-----------------------------------
> But the weird thing is that the script fails with the same error message if I apply the PIG-2665 patch.
This issue was failing because python.cachedir was set to an incorrect path at initialization.
With the PIG-2665 patch and jython-standalone-2.5.2.jar, it seems to fail because 'python.cachedir.skip' is somehow set to true by default.
The error message is the same, but the cause is different.
> To make sure a later patch does not break this script, please add a test case.
Adding a test case for the PIG-2665 failure is probably easy. For this issue, I don't know of a good way: the owner of the jython-2.5.0.jar directory and the user running the test need to be different for the problem to occur. When I tested manually, I just ran mkdir ./build/ivy/lib/Pig/cachedir followed by chmod 000 ./build/ivy/lib/Pig/cachedir.
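The manual reproduction above (create the cachedir, then remove all permissions from it) could be scripted roughly as follows. This is a hedged sketch in plain Python, not part of any attached patch: the helper names make_unusable_cachedir and is_usable are hypothetical, and it works in a scratch directory rather than under ./build/ivy/lib/Pig.

```python
import os
import stat
import tempfile


def make_unusable_cachedir(parent):
    """Create parent/cachedir with mode 000, like the mkdir + chmod 000 steps."""
    path = os.path.join(parent, "cachedir")
    os.mkdir(path)
    os.chmod(path, 0)
    return path


def is_usable(path):
    """True if the current user can list, enter and write to the directory.

    Note: a privileged user (root) typically passes this check regardless
    of the mode bits, so the reproduction only works for ordinary users.
    """
    return os.access(path, os.R_OK | os.W_OK | os.X_OK)


# Demonstration in a scratch directory (hypothetical location).
parent = tempfile.mkdtemp(prefix="pig2741_")
cachedir = make_unusable_cachedir(parent)
print(cachedir, "usable:", is_usable(cachedir))

# Restore permissions so the scratch directories can be removed again.
os.chmod(cachedir, stat.S_IRWXU)
os.rmdir(cachedir)
os.rmdir(parent)
```

This covers only the file-system half of the reproduction; running the Pig script against such a directory is not shown.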
> Attachments: pig-2741-no-test-yet-v1.patch.txt
[jira] [Commented] (PIG-2741) Python script throws an NameError:
name 'Configuration' is not defined in case cache dir is not created
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290612#comment-13290612 ]
Daniel Dai commented on PIG-2741:
---------------------------------
The patch looks good to me. But the weird thing is that the script fails with the same error message if I apply the PIG-2665 patch.
To make sure a later patch does not break this script, please add a test case. I will take a look at PIG-2665.
> Attachments: pig-2741-no-test-yet-v1.patch.txt
[jira] [Commented] (PIG-2741) Python script throws an NameError:
name 'Configuration' is not defined in case cache dir is not created
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290897#comment-13290897 ]
Daniel Dai commented on PIG-2741:
---------------------------------
both 0.10 and trunk.
> Attachments: pig-2741-no-test-yet-v1.patch.txt, pig-2741-testfailing-pig2665-v2.patch.txt, pig-2741-testfailing-pig2665-v3.patch.txt, pig-2741-testfailing-pig2665-v4.patch.txt
[jira] [Commented] (PIG-2741) Python script throws an NameError:
name 'Configuration' is not defined in case cache dir is not created
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289959#comment-13289959 ]
Daniel Dai commented on PIG-2741:
---------------------------------
The problem is that the user doesn't have write permission on the default python.cachedir; the workaround is to specify a python.cachedir the user can write to.
[jira] [Commented] (PIG-2741) Python script throws an NameError:
name 'Configuration' is not defined in case cache dir is not created
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290891#comment-13290891 ]
Daniel Dai commented on PIG-2741:
---------------------------------
The test case is only applicable to Pig trunk. If we plan to commit to the 0.10 branch, we need to make a test case for both 0.10 and trunk.
> Attachments: pig-2741-no-test-yet-v1.patch.txt, pig-2741-testfailing-pig2665-v2.patch.txt, pig-2741-testfailing-pig2665-v3.patch.txt
[jira] [Updated] (PIG-2741) Python script throws an NameError: name
'Configuration' is not defined in case cache dir is not created
Posted by "Koji Noguchi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Koji Noguchi updated PIG-2741:
------------------------------
Attachment: pig-2741-no-test-yet-v1.patch.txt
Took out the PythonInterpreter.initialize and delayed the argument setting to later. Would this work?
> Attachments: pig-2741-no-test-yet-v1.patch.txt
[jira] [Commented] (PIG-2741) Python script throws an NameError:
name 'Configuration' is not defined in case cache dir is not created
Posted by "Rohini Palaniswamy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13291486#comment-13291486 ]
Rohini Palaniswamy commented on PIG-2741:
-----------------------------------------
Daniel,
You seem to have put a hardcoded path in the test by mistake.
+sys.path.append("/Users/daijy/hadoop-1.0.0/hadoop-core-1.0.0.jar")
> Attachments: pig-2741-no-test-yet-v1.patch.txt, pig-2741-testfailing-pig2665-v2.patch.txt, pig-2741-testfailing-pig2665-v3.patch.txt, pig-2741-testfailing-pig2665-v4.patch.txt
[jira] [Commented] (PIG-2741) Python script throws an NameError:
name 'Configuration' is not defined in case cache dir is not created
Posted by "Rohini Palaniswamy (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290659#comment-13290659 ]
Rohini Palaniswamy commented on PIG-2741:
-----------------------------------------
Also, because the default cache directory is always pig_lib_dir/cachedir/packages, it might cause issues when multiple users run scripts at the same time. Koji's fix gets us back to the old code path in JythonScriptEngine, which generated a random directory and set it as the Python cache directory.
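To illustrate the point about a per-run random directory, here is a minimal sketch in plain Python. It is not the JythonScriptEngine code; the helper name make_private_cachedir is hypothetical, and the pig_jython_ prefix is simply borrowed from the log line in the report.

```python
import os
import shutil
import tempfile


def make_private_cachedir(prefix="pig_jython_"):
    """Create a fresh cache directory for the current run (hypothetical helper)."""
    # mkdtemp picks a unique random suffix and creates the directory so that
    # only the creating user can access it; two runs never share the same path.
    return tempfile.mkdtemp(prefix=prefix)


first = make_private_cachedir()
second = make_private_cachedir()
print(first != second)  # each call yields a distinct directory

# Clean up the directories created for this demonstration.
shutil.rmtree(first)
shutil.rmtree(second)
```

With a fixed shared location, the first user to create the directory owns it and later users may not be able to write to it; a unique directory per run sidesteps that.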
[jira] [Commented] (PIG-2741) Python script throws an NameError:
name 'Configuration' is not defined in case cache dir is not created
Posted by "Koji Noguchi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13290347#comment-13290347 ]
Koji Noguchi commented on PIG-2741:
-----------------------------------
Note that the "can't create" error appears before JythonScriptEngine creates the python.cachedir directory.
bq. sys-package-mgr: can't create package cache dir, '/mydir/xx'
bq. 2012-06-06 01:20:45,815 [main] INFO org.apache.pig.scripting.jython.JythonScriptEngine - created tmp python.cachedir=/tmp/pig_jython_7126458276821733512
This seems to be a regression from PIG-2548 (Support for providing parameters to python script):
bq. src/org/apache/pig/scripting/jython/JythonScriptEngine.java
bq. 359 PythonInterpreter.initialize(null, null, argv);
This initialization is now called before JythonScriptEngine updates python.cachedir, which leads to this error.
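The ordering problem described above can be modelled with a toy example in plain Python. This is only a sketch: ToyInterpreter is a hypothetical stand-in, and the assumption that the interpreter reads python.cachedir once at initialization is taken from the description in this comment, not from the Jython source.

```python
class ToyInterpreter(object):
    """Stand-in for an interpreter that reads its settings once, at initialize()."""

    _cachedir = None

    @classmethod
    def initialize(cls, properties):
        # Snapshot the setting; later changes to `properties` are not seen.
        cls._cachedir = properties.get("python.cachedir")

    @classmethod
    def cachedir(cls):
        return cls._cachedir


def broken_order(properties, tmp_dir):
    ToyInterpreter.initialize(properties)      # initialized too early
    properties["python.cachedir"] = tmp_dir    # update arrives after the snapshot
    return ToyInterpreter.cachedir()


def fixed_order(properties, tmp_dir):
    properties["python.cachedir"] = tmp_dir    # set the property first
    ToyInterpreter.initialize(properties)
    return ToyInterpreter.cachedir()


print(broken_order({"python.cachedir": "/mydir/xx"}, "/tmp/pig_jython_demo"))
print(fixed_order({"python.cachedir": "/mydir/xx"}, "/tmp/pig_jython_demo"))
```

In the broken order the interpreter keeps the unusable /mydir/xx, which matches the log where the "can't create" message is printed before the temporary directory is reported.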
[jira] [Updated] (PIG-2741) Python script throws an NameError: name
'Configuration' is not defined in case cache dir is not created
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai updated PIG-2741:
----------------------------
Attachment: pig-2741-testfailing-pig2665-v4.patch.txt
Add a different test case which is applicable to
[jira] [Resolved] (PIG-2741) Python script throws an NameError:
name 'Configuration' is not defined in case cache dir is not created
Posted by "Daniel Dai (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai resolved PIG-2741.
-----------------------------
Resolution: Fixed
Fix Version/s: 0.10.1, 0.11
Assignee: Koji Noguchi
Hadoop Flags: Reviewed
Patch committed to both 0.10 branch and trunk. Thanks Koji!
> Python script throws an NameError: name 'Configuration' is not defined in case cache dir is not created
> -------------------------------------------------------------------------------------------------------
>
> Key: PIG-2741
> URL: https://issues.apache.org/jira/browse/PIG-2741
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.10.0
> Environment: Pig 0.10
> Reporter: Viraj Bhat
> Assignee: Koji Noguchi
> Fix For: 0.11, 0.10.1
>
> Attachments: pig-2741-no-test-yet-v1.patch.txt, pig-2741-testfailing-pig2665-v2.patch.txt, pig-2741-testfailing-pig2665-v3.patch.txt, pig-2741-testfailing-pig2665-v4.patch.txt, pig-2741-testfailing-pig2665-v5.patch.txt, pig-2741-testfailing-pig2665-v5.patch.txt, pig-2741-testfailing-pig2665-v6.patch.txt
[jira] [Updated] (PIG-2741) Python script throws an NameError: name
'Configuration' is not defined in case cache dir is not created
Posted by "Koji Noguchi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Koji Noguchi updated PIG-2741:
------------------------------
Attachment: pig-2741-testfailing-pig2665-v6.patch.txt
Hopefully this is the last one. Took out the following line:
bq. +sys.path.append("/Users/daijy/hadoop-1.0.0/hadoop-core-1.0.0.jar")
[jira] [Updated] (PIG-2741) Python script throws an NameError: name
'Configuration' is not defined in case cache dir is not created
Posted by "Koji Noguchi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Koji Noguchi updated PIG-2741:
------------------------------
Attachment: pig-2741-testfailing-pig2665-v3.patch.txt
Daniel asked me to add a brief comment on what the test is doing. Updated.
Also, I forgot to mention that during testing I found that PIG_CMD_ARGS_REMAINDERS could be null when called from the unit test. I added an extra check with a warning to avoid the NPE.
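The kind of guard described here might look like the following sketch, written in plain Python rather than the Java of the actual patch. The property key string, the space-separated value format, and the helper name script_args are all assumptions made for illustration.

```python
import logging

log = logging.getLogger("pig_2741_sketch")


def script_args(properties, key="pig.cmd.args.remainders"):
    """Return the extra command-line arguments, or an empty list if none were recorded.

    The key name and the space-separated format are hypothetical.
    """
    raw = properties.get(key)
    if raw is None:
        # From a unit test there may be no command line at all: warn and continue
        # instead of failing on the missing value.
        log.warning("%s is not set; running the script with no arguments", key)
        return []
    return raw.split()


print(script_args({}))
print(script_args({"pig.cmd.args.remainders": "a b"}))
```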
[jira] [Updated] (PIG-2741) Python script throws an NameError: name
'Configuration' is not defined in case cache dir is not created
Posted by "Koji Noguchi (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Koji Noguchi updated PIG-2741:
------------------------------
Attachment: pig-2741-testfailing-pig2665-v2.patch.txt
bq. Adding testcase for PIG-2665 failing is probably easy. As for this jira,
Added a test case that fails when run with the PIG-2665 patch.
To test this JIRA itself, I manually ran
$ chmod 000 ./build/ivy/lib/Pig/cachedir
and confirmed that the script fails with "NameError: name 'Configuration' is not defined" before the patch and succeeds after it.
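For anyone who wants to reproduce an uncreatable cache directory without touching the build tree, here is a small sketch in plain Python. Note that it swaps in a different trigger from the chmod 000 used above: it places a regular file where the directory is expected, which also fails when run as root. The helper name can_create_cache_dir is hypothetical.

```python
import os
import shutil
import tempfile


def can_create_cache_dir(path):
    """Return True if `path` already is a directory or can be created."""
    try:
        if not os.path.isdir(path):
            os.makedirs(path)
        return True
    except OSError:
        return False


workdir = tempfile.mkdtemp()
blocker = os.path.join(workdir, "cachedir")
open(blocker, "w").close()  # a regular file where a directory is expected

print(can_create_cache_dir(os.path.join(workdir, "ok", "packages")))  # creatable
print(can_create_cache_dir(os.path.join(blocker, "packages")))        # blocked by the file
shutil.rmtree(workdir)
```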