Posted to user@pig.apache.org by Ray <fe...@gmail.com> on 2012/10/08 11:28:18 UTC

Re: How can I access secure HBase in UDF

Rohini,

I can get the HBASE_AUTH_TOKEN and set it in UDFContext.
But I still don't know how to use it in exec() of my EvalFunc.
UDFContext only gives you a "Configuration" object, not the "Job" object.
How do I make sure the token is used when the HTable is created?

Thanks,
Ray

On Fri, Sep 28, 2012 at 1:54 AM, Rohini Palaniswamy <rohini.aditya@gmail.com> wrote:

> Ray,
>    In the frontend, you can do a new JobConf(HBaseConfiguration.create())
> and pass that to TableMapReduceUtil.initCredentials(). It will fetch the
> credentials and add them to JobConf.credentials. You can then loop through
> jobConf.getCredentials().getAllTokens(), get the token of kind
> HBASE_AUTH_TOKEN, and add it to UDFContext. On the backend you need to get
> the token from UDFContext and add it to the JobConf you create the HTable with.
>
> Regards,
> Rohini
>
> On Thu, Sep 27, 2012 at 4:19 AM, Ray <fe...@gmail.com> wrote:
>
> > Thank you guys for your replies.
> >
> > TableMapReduceUtil.initCredentials() needs a Job object as its argument, not the
> > Configuration object returned by UDFContext.getUDFContext().getJobConf().
> >
> > I haven't dug into HBase credentials very deeply, so I don't know how
> > to create a credential file and cache it.
> > All I know is how to use the utilities provided by Hadoop and HBase to create
> > credentials for a Job, such as TableMapReduceUtil.initCredentials().
> >
> > But to use those utilities, I need to get the Job object while it is being
> > constructed in the backend, as in setLocation() of LoadFunc. Then the
> > credentials can take effect when the job runs.
> > I guess I can't do this in EvalFunc.
> >
> >
> > On Wed, Sep 26, 2012 at 11:26 PM, Ashutosh Chauhan <hashutosh@apache.org> wrote:
> >
> > > I would suggest using the HBase API to manage your credentials instead of
> > > handling them on your own.
> > >
> > > TableMapReduceUtil.initCredentials(UDFContext.getUDFContext().getJobConf());
> > >
> > > Ashutosh
> > > On Tue, Sep 25, 2012 at 10:38 PM, Rohini Palaniswamy <rohini.aditya@gmail.com> wrote:
> > >
> > > > That should work too. Just a note that UDFContext is serialized in the JobConf
> > > > and would be visible to all (though base64 encoded) through the Jobtracker UI
> > > > unless acls for viewing are set. If you are not as paranoid about
> > > > security as at Yahoo! then it should be fine.
> > > >
> > > > -Rohini
> > > >
> > > > On Tue, Sep 25, 2012 at 5:09 PM, Alan Gates <ga...@hortonworks.com> wrote:
> > > >
> > > > > You can use the UDFContext to pass information for the UDF in the JobConf
> > > > > without writing files.
> > > > >
> > > > > Alan.
> > > > >
> > > > > On Sep 25, 2012, at 10:48 AM, Rohini Palaniswamy wrote:
> > > > >
> > > > > > Ray,
> > > > > >   Looking at the EvalFunc interface, I cannot see a way or loophole to do
> > > > > > it. EvalFunc does not have a reference to a Job or JobConf object to add
> > > > > > credentials to. It has getCacheFiles() to add files to the DistributedCache,
> > > > > > but no method to add credentials. We should probably add one. The not so
> > > > > > nice workaround I can think of is to add the token as a file
> > > > > > to the DistributedCache using getCacheFiles(), read the file yourself in
> > > > > > EvalFunc, and use it in communication with HBase.
> > > > > >
> > > > > > Regards,
> > > > > > Rohini
> > > > > >
> > > > > > On Tue, Sep 25, 2012 at 1:51 AM, Ray <fe...@gmail.com> wrote:
> > > > > >
> > > > > >> Hi,
> > > > > >>
> > > > > > >> I need to access HBase in a UDF, but HBase is configured to
> > > > > > >> be secure, which requires a credential when connecting from a mapreduce
> > > > > > >> job.
> > > > > > >> I see you have added support for secure HBase in HBaseStorage:
> > > > > > >> https://issues.apache.org/jira/browse/PIG-2821?attachmentSortBy=dateTime
> > > > > > >> But my UDF is an EvalFunc, not a Loader.
> > > > > > >>
> > > > > > >> Could you tell me how I can achieve this? Or is there any way I can add
> > > > > > >> the credential to the Job when the job is set up in the backend?
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Ray
> > > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>
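
The caveat quoted above, that values placed in the JobConf are only Base64 encoded, is worth making concrete: Base64 is an encoding, not encryption. Here is a minimal Java sketch, assuming a hypothetical string value copied from a job configuration page. The class and method names are made up for illustration, and the exact serialized form Pig uses for UDFContext is not modeled here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Illustration only: shows that Base64 text can be decoded by any reader. */
final class Base64IsNotSecrecy {

    /** Decodes a Base64 string back to the text it was produced from. */
    static String reveal(String base64Value) {
        byte[] raw = Base64.getDecoder().decode(base64Value);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical value standing in for something stored in a job configuration.
        String stored = Base64.getEncoder()
                .encodeToString("secret-token".getBytes(StandardCharsets.UTF_8));
        // Anyone who can read 'stored' can recover the original text.
        System.out.println(reveal(stored));
    }
}
```

Whether this matters depends, as the quoted note says, on who is allowed to view job configurations on the cluster.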

Re: How can I access secure HBase in UDF

Posted by Rohini Palaniswamy <ro...@gmail.com>.
Ray,
   I missed your reply. You might have already figured it out by now. I
just created https://issues.apache.org/jira/browse/PIG-3030 to provide a
helper class for this. I will put up a patch by tomorrow.

Regards,
Rohini
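
For readers following the thread, the hand-off discussed here (store the token in the UDF's properties on the frontend, read it back in exec() on the backend) can be modeled without a cluster. The sketch below is only an illustration under stated assumptions and is not the PIG-3030 helper. It uses a plain java.util.Properties in place of the object a UDF would get from UDFContext.getUDFContext().getUDFProperties(...), raw bytes in place of a Hadoop Token, and a made-up key name, example.hbase.auth.token.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;
import java.util.Properties;

/** Stand-alone model of passing an opaque token through string properties. */
final class TokenHandoffSketch {

    // Hypothetical key; not a name defined by Pig or HBase.
    static final String TOKEN_KEY = "example.hbase.auth.token";

    /** Frontend side: store the serialized token as a URL-safe Base64 string. */
    static void storeToken(Properties udfProps, byte[] tokenBytes) {
        // In a real UDF the string would more likely come from
        // Token.encodeToUrlString() on the HBASE_AUTH_TOKEN.
        String encoded = Base64.getUrlEncoder().encodeToString(tokenBytes);
        udfProps.setProperty(TOKEN_KEY, encoded);
    }

    /** Backend side: recover the token bytes, or null if none were stored. */
    static byte[] loadToken(Properties udfProps) {
        String encoded = udfProps.getProperty(TOKEN_KEY);
        if (encoded == null) {
            return null; // nothing was passed from the frontend
        }
        return Base64.getUrlDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        byte[] original = "opaque-token-bytes".getBytes(StandardCharsets.UTF_8);
        storeToken(props, original);          // what the frontend would do
        byte[] restored = loadToken(props);   // what exec() would do
        System.out.println(Arrays.equals(original, restored)); // prints true
    }
}
```

In a real EvalFunc the stored string could be turned back into a token with Token.decodeFromUrlString(). One option on the backend, untested here and stated as an assumption, is to add the decoded token to the current user with UserGroupInformation.getCurrentUser().addToken(token) before constructing the HTable.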
