Posted to user@pig.apache.org by Chris Olston <ol...@yahoo-inc.com> on 2009/01/09 19:02:44 UTC

how to access HDFS from inside a Pig UDF

Hi,

This may have already been asked, but I couldn't find anything in the old
mails. I did find an old bug report, PIG-66, about this, but it was closed
with no pointer to what the outcome was.

My question:
Is there any way to get a handle on the HDFS from inside a Pig UDF (in
particular, a StoreFunc)?
(Alternatively, if I could get the Hadoop JobConf, that would let me get
the HDFS by calling FileSystem.get(conf).)

My use case is:
I'm building a StoreFunc that creates a Lucene index, following the rubric
from the hadoop.contrib.index code, in which you first have Lucene create
index files in the local FS, and then copy them to the HDFS.

Thanks!

-Chris

--
Christopher Olston, Ph.D.
Sr. Research Scientist
Yahoo! Research





Re: how to access HDFS from inside a Pig UDF

Posted by Ted Dunning <te...@gmail.com>.
Sounds right to me (as a non-implementor).

On Fri, Jan 9, 2009 at 10:37 AM, Craig Macdonald <cr...@dcs.gla.ac.uk> wrote:

> Would it make sense if Pig automatically detected UDFs that implement
> hadoop.conf.Configurable, and automatically passed these the JobConf?
>
>

Re: how to access HDFS from inside a Pig UDF

Posted by Craig Macdonald <cr...@dcs.gla.ac.uk>.
Would it make sense if Pig automatically detected UDFs that implement
hadoop.conf.Configurable, and automatically passed these the JobConf?

Craig
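
A rough sketch of the detection idea above, in case it helps the discussion.
This is not how Pig works today, only an illustration of the proposal. Every
type in it (JobSettings, AcceptsSettings, StoreUdf, the two UDF classes) and
the key "example.fs.uri" are hypothetical stand-ins, written against the plain
JDK so the sketch runs on its own. In a real version the stand-ins would
presumably be the Hadoop JobConf, hadoop.conf.Configurable and Pig's StoreFunc,
and the UDF would call FileSystem.get(conf) once it had the configuration.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the proposal: before running a UDF, the framework checks whether
// it implements a "configurable" interface and, if so, hands it the job
// configuration. All types here are hypothetical stand-ins, not Pig or
// Hadoop classes.
class ConfigurableUdfSketch {

    // Stand-in for a job configuration such as Hadoop's JobConf.
    static class JobSettings {
        private final Map<String, String> values = new HashMap<>();
        void set(String key, String value) { values.put(key, value); }
        String get(String key) { return values.get(key); }
    }

    // Stand-in for an opt-in interface in the spirit of Configurable.
    interface AcceptsSettings {
        void setSettings(JobSettings settings);
    }

    // Stand-in for a store function; the real StoreFunc API is different.
    interface StoreUdf {
        String describeTarget();
    }

    // A UDF that opts in to receiving the configuration.
    static class IndexStoreUdf implements StoreUdf, AcceptsSettings {
        private JobSettings settings;

        @Override
        public void setSettings(JobSettings settings) { this.settings = settings; }

        @Override
        public String describeTarget() {
            // With a real configuration, this is the point where the UDF
            // could obtain a file system handle from it.
            return settings == null ? "unconfigured" : settings.get("example.fs.uri");
        }
    }

    // A UDF that does not opt in; the framework leaves it untouched.
    static class PlainStoreUdf implements StoreUdf {
        @Override
        public String describeTarget() { return "no settings needed"; }
    }

    // What the framework would do once per UDF instance before using it.
    static void prepare(StoreUdf udf, JobSettings settings) {
        if (udf instanceof AcceptsSettings) {
            ((AcceptsSettings) udf).setSettings(settings);
        }
    }

    public static void main(String[] args) {
        JobSettings settings = new JobSettings();
        settings.set("example.fs.uri", "hdfs://namenode.example:9000");

        StoreUdf[] udfs = { new IndexStoreUdf(), new PlainStoreUdf() };
        for (StoreUdf udf : udfs) {
            prepare(udf, settings);
            System.out.println(udf.getClass().getSimpleName() + " -> " + udf.describeTarget());
        }
    }
}
```

The point of the instanceof check is that existing UDFs need no changes: only
the ones that declare the interface receive the configuration.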

Chris Olston wrote:
> Hi,
>
> This may have already been asked, but I couldn't find anything in old mails
> ... I did find an old bug report PIG-66 about this but it got closed with no
> pointer to what the outcome was.
>
> My question:
> Is there any way to get a handle on the HDFS from inside a Pig UDF (in
> particular, a StoreFunc)?
> (Alternatively, if I can get the hadoop JobConf that would allow me to get
> the HDFS by calling FileSystem.get(conf).)
>
> My use case is:
> I'm building a StoreFunc that creates a Lucene index, following the rubric
> from the hadoop.contrib.index code, in which you first have Lucene create
> index files in the local FS, and then copy them to the HDFS.
>
> Thanks!
>
> -Chris