Posted to user@pig.apache.org by Chris Olston <ol...@yahoo-inc.com> on 2009/01/09 19:02:44 UTC
how to access HDFS from inside a Pig UDF
Hi,
This may have already been asked, but I couldn't find anything in the old
mails. I did find an old bug report, PIG-66, about this, but it was closed
with no pointer to what the outcome was.
My question:
Is there any way to get a handle on the HDFS from inside a Pig UDF (in
particular, a StoreFunc)?
(Alternatively, getting the Hadoop JobConf would be enough, since I could then
get the HDFS by calling FileSystem.get(conf).)
My use case is:
I'm building a StoreFunc that creates a Lucene index, following the rubric
from the hadoop.contrib.index code, in which you first have Lucene create
index files in the local FS, and then copy them to the HDFS.
Thanks!
-Chris
--
Christopher Olston, Ph.D.
Sr. Research Scientist
Yahoo! Research
Re: how to access HDFS from inside a Pig UDF
Posted by Ted Dunning <te...@gmail.com>.
Sounds right to me (as a non-implementor).
On Fri, Jan 9, 2009 at 10:37 AM, Craig Macdonald <cr...@dcs.gla.ac.uk> wrote:
> Would it make sense if Pig automatically detected UDFs that implement
> hadoop.conf.Configurable, and automatically passed these the JobConf?
Re: how to access HDFS from inside a Pig UDF
Posted by Craig Macdonald <cr...@dcs.gla.ac.uk>.
Would it make sense if Pig automatically detected UDFs that implement
hadoop.conf.Configurable, and automatically passed these the JobConf?
Craig
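A rough sketch of the detect-and-inject idea proposed above, so that it is clear what is being suggested. This is not Pig code and does not describe anything Pig actually does. To keep it runnable without Hadoop on the classpath, Conf and Configurable below are hypothetical stand-ins for Hadoop's JobConf and org.apache.hadoop.conf.Configurable, and StoreFunc, IndexingStoreFunc, PlainStoreFunc and injectConf are made-up names for illustration only.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ConfigurableUdfSketch {

    /** Hypothetical stand-in for Hadoop's JobConf: just a string-to-string map. */
    public static class Conf {
        private final Map<String, String> props = new HashMap<>();
        public void set(String key, String value) { props.put(key, value); }
        public String get(String key) { return props.get(key); }
    }

    /** Hypothetical stand-in mirroring the shape of hadoop.conf.Configurable. */
    public interface Configurable {
        void setConf(Conf conf);
        Conf getConf();
    }

    /** Hypothetical marker for a user-defined store function. */
    public interface StoreFunc { }

    /** A UDF that opts in to receiving the configuration. */
    public static class IndexingStoreFunc implements StoreFunc, Configurable {
        private Conf conf;
        @Override public void setConf(Conf conf) { this.conf = conf; }
        @Override public Conf getConf() { return conf; }
    }

    /** A UDF that does not opt in; the framework leaves it untouched. */
    public static class PlainStoreFunc implements StoreFunc { }

    /**
     * What the framework side might do: hand the configuration to every UDF
     * that implements Configurable. Returns how many UDFs received it.
     */
    public static int injectConf(List<? extends StoreFunc> udfs, Conf conf) {
        int injected = 0;
        for (StoreFunc udf : udfs) {
            if (udf instanceof Configurable) {
                ((Configurable) udf).setConf(conf);
                injected++;
            }
        }
        return injected;
    }

    public static void main(String[] args) {
        Conf conf = new Conf();
        conf.set("example.key", "example-value"); // hypothetical key

        IndexingStoreFunc indexing = new IndexingStoreFunc();
        PlainStoreFunc plain = new PlainStoreFunc();

        int injected = injectConf(Arrays.asList(indexing, plain), conf);
        System.out.println("UDFs that received the conf: " + injected);
        System.out.println("example.key = " + indexing.getConf().get("example.key"));
    }
}
```

With the real Hadoop types, a StoreFunc that received the JobConf this way could then call FileSystem.get(conf), as in the original question, to reach the HDFS.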