Posted to user@hive.apache.org by Ray Duong <ra...@gmail.com> on 2010/05/13 19:58:53 UTC

Referencing External Files from Hive

Hi,

I have two Python scripts loaded into Hive, and one of them imports the
other as a class file.  However, when I run the transform statement with the
first script, it fails because it cannot find the other file named in its
import statement.

So, is there a way to reference the other Python script?  Or do I have to
embed everything in one file?  BTW, when I use add file, which directory on
the slaves are the files copied to?

Thanks,
-ray

Python: foo.py

#!/usr/bin/env python

from bar import bar1


Hive:

add file foo.py;
add file bar.py;

select
  transform(x, y)
  using 'python foo.py'
  as x, y, z
from
  footable;

*stderr logs*

Traceback (most recent call last):
  File "foo.py", line 6, in ?
    from bar import bar1
ImportError: No module named bar

Re: Referencing External Files from Hive

Posted by Ray Duong <ra...@gmail.com>.
Thanks, that worked.

-ray

On Thu, May 13, 2010 at 1:06 PM, Dilip Joseph <dilip.antony.joseph@gmail.com> wrote:

> I had the same problem.  Adding the following lines before the import
> solved the problem:
>
> import sys
> import os
> sys.path.append(os.getcwd())
>
> Dilip

Re: Referencing External Files from Hive

Posted by Dilip Joseph <di...@gmail.com>.
I had the same problem.  Adding the following lines before the import
solved the problem:

import sys
import os
sys.path.append(os.getcwd())
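
For anyone who wants to try this outside Hive, here is a minimal sketch of the
same idea. It assumes (I have not verified this on every Hadoop version) that
files registered with add file show up in the task's current working
directory, and that the import fails because that directory is not on
sys.path. The temporary directory, the module name bar_demo, and the helper
ensure_on_path are hypothetical stand-ins for illustration only; in a real
transform script only the sys.path lines above are needed.

```python
import os
import sys
import tempfile


def ensure_on_path(directory):
    """Append directory to sys.path unless it is already listed."""
    directory = os.path.abspath(directory)
    if directory not in sys.path:
        sys.path.append(directory)
    return directory


# Stand-in for the task's working directory: a temp dir that holds a
# hypothetical helper module (bar_demo.py plays the role of bar.py).
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "bar_demo.py"), "w") as f:
    f.write("def bar1(x):\n    return x * 2\n")

# Move into that directory, as if the task had been started there.
os.chdir(workdir)

# Report the working directory on stderr so it does not mix with the
# rows a transform script writes to stdout.
sys.stderr.write("cwd: %s\n" % os.getcwd())

# The fix from this thread: make the working directory importable.
ensure_on_path(os.getcwd())

from bar_demo import bar1

result = bar1(21)
```

Writing os.getcwd() to stderr like this is also a cheap way to check where
the files actually land on a given cluster, assuming the task's stderr is
captured in its logs.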

Dilip

-- 
_________________________________________
Dilip Antony Joseph
http://www.marydilip.info