Posted to user@accumulo.apache.org by James Srinivasan <ja...@gmail.com> on 2017/06/07 19:54:03 UTC

Re: ClientConfiguration using Kerberos & MapReduce

[snip]
>> Fortunately I found this:
>>
>> https://github.com/apache/hive/blob/master/accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/mr/HiveAccumuloTableInputFormat.java
>>
>> Is it a good example of Accumulo + MapReduce that I can copy?
> That one is definitely overkill. There's a bit of reflection in there to
> work around older versions of Accumulo. However, it should be an example of
> something that does work with Kerberos authentication.
> Also, take note that Hive uses the InputFormat regardless of the execution
> engine (local, MapReduce, Tez, etc). There are some comments to that effect
> in the code. You can likely simplify those methods/blocks as well :)

I think those are two things I'll need to handle at some point anyway.
I think I'm setting all the AccumuloInputFormat statics correctly, and
see the DelegationToken in my job's and context's credentials.
However, my custom InputFormat's createRecordReader function needs to
connect to Accumulo to get some config. Am I right in thinking I need
to convert the Hadoop wrapped token (kind=ACCUMULO_AUTH_TOKEN) into an
Accumulo DelegationToken to create my connector? If so, how do I do
that?

Thanks very much,

James

Re: ClientConfiguration using Kerberos & MapReduce

Posted by James Srinivasan <ja...@gmail.com>.
>> Brilliant - that's it! Now my custom InputFormat is working.
> If you're interested and have the ability to cut/paste your code, updating
> the Accumulo examples[1] with more of the nitty-gritty on using
> Kerberos+MapReduce would be great! You definitely have the hard stuff done
> :)
>
> [1] https://accumulo.apache.org/1.8/examples/

I would...but it's Scala. The most interesting bit is here:

https://github.com/jrs53/geomesa/blob/17a3da56a041f5dde5d61d57fc3ca92dff0a1dc0/geomesa-accumulo/geomesa-accumulo-jobs/src/main/scala/org/locationtech/geomesa/jobs/mapreduce/GeoMesaAccumuloInputFormat.scala#L176-L215

Re: ClientConfiguration using Kerberos & MapReduce

Posted by Josh Elser <jo...@gmail.com>.
On 6/8/17 4:10 PM, James Srinivasan wrote:
> [snip]
>> https://github.com/apache/accumulo/blob/f81a8ec7410e789d11941351d5899b8894c6a322/core/src/main/java/org/apache/accumulo/core/client/mapreduce/lib/impl/ConfiguratorBase.java#L485-L500
>>
>> This pulls the "DelegationTokenStub" out of the InputFormat and creates a
>> real Accumulo AuthenticationToken (which you can use with a Connector
>> as usual).
> 
> Brilliant - that's it! Now my custom InputFormat is working.
> 
> Thanks ever so much for your help - it is hugely appreciated!
> 
> James
> 

Fantastic. Glad to hear it!

If you're interested and have the ability to cut/paste your code, 
updating the Accumulo examples[1] with more of the nitty-gritty on using 
Kerberos+MapReduce would be great! You definitely have the hard stuff
done :)

[1] https://accumulo.apache.org/1.8/examples/

Re: ClientConfiguration using Kerberos & MapReduce

Posted by James Srinivasan <ja...@gmail.com>.
[snip]
> https://github.com/apache/accumulo/blob/f81a8ec7410e789d11941351d5899b8894c6a322/core/src/main/java/org/apache/accumulo/core/client/mapreduce/lib/impl/ConfiguratorBase.java#L485-L500
>
> This pulls the "DelegationTokenStub" out of the InputFormat and creates a
> real Accumulo AuthenticationToken (which you can use with a Connector
> as usual).

Brilliant - that's it! Now my custom InputFormat is working.

Thanks ever so much for your help - it is hugely appreciated!

James

Re: ClientConfiguration using Kerberos & MapReduce

Posted by Josh Elser <jo...@gmail.com>.
On 6/7/17 3:54 PM, James Srinivasan wrote:
> [snip]
>>> Fortunately I found this:
>>>
>>> https://github.com/apache/hive/blob/master/accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/mr/HiveAccumuloTableInputFormat.java
>>>
>>> Is it a good example of Accumulo + MapReduce that I can copy?
>> That one is definitely overkill. There's a bit of reflection in there to
>> work around older versions of Accumulo. However, it should be an example of
>> something that does work with Kerberos authentication.
>> Also, take note that Hive uses the InputFormat regardless of the execution
>> engine (local, MapReduce, Tez, etc). There are some comments to that effect
>> in the code. You can likely simplify those methods/blocks as well :)
> 
> I think those are two things I'll need to handle at some point anyway.
> I think I'm setting all the AccumuloInputFormat statics correctly, and
> see the DelegationToken in my job's and context's credentials.
> However, my custom InputFormat's createRecordReader function needs to
> connect to Accumulo to get some config. Am I right in thinking I need
> to convert the Hadoop wrapped token (kind=ACCUMULO_AUTH_TOKEN) into an
> Accumulo DelegationToken to create my connector? If so, how do I do
> that?

Yes, you need to deserialize the AuthenticationToken from the 
InputSplit. You can look back into the AccumuloInputFormat 
implementation to see how this is done:

https://github.com/apache/accumulo/blob/f81a8ec7410e789d11941351d5899b8894c6a322/core/src/main/java/org/apache/accumulo/core/client/mapreduce/AbstractInputFormat.java#L515-L518

calls

https://github.com/apache/accumulo/blob/f81a8ec7410e789d11941351d5899b8894c6a322/core/src/main/java/org/apache/accumulo/core/client/mapreduce/AbstractInputFormat.java#L240-L243

calls

https://github.com/apache/accumulo/blob/f81a8ec7410e789d11941351d5899b8894c6a322/core/src/main/java/org/apache/accumulo/core/client/mapreduce/lib/impl/ConfiguratorBase.java#L485-L500

This pulls the "DelegationTokenStub" out of the InputFormat and creates 
a real Accumulo AuthenticationToken (which you can use with a Connector 
as usual).
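
As a rough illustration of that last unwrapping step, here is a JDK-only
sketch. It is not Accumulo's code: every type and method in it
(TokenUnwrapSketch, StubToken, RealToken, WrappedToken, Credentials,
unwrap) is a hypothetical stand-in, and the "my-instance" service name is
made up. It only models the idea described above: a stub travels with the
job configuration, the Hadoop-wrapped token (kind=ACCUMULO_AUTH_TOKEN)
travels in the job's credentials, and the task side swaps the stub for a
usable token.

```java
import java.util.HashMap;
import java.util.Map;

// JDK-only model of the stub-unwrapping idea. Every type here is a
// hypothetical stand-in, not a class from Accumulo or Hadoop.
final class TokenUnwrapSketch {

  /** Stand-in for Accumulo's AuthenticationToken. */
  interface AuthToken {}

  /** Placeholder carried in the job configuration; it names a service but holds no secret. */
  static final class StubToken implements AuthToken {
    final String serviceName;

    StubToken(String serviceName) {
      this.serviceName = serviceName;
    }
  }

  /** Usable token rebuilt on the task side from identifier and password bytes. */
  static final class RealToken implements AuthToken {
    final byte[] identifier;
    final byte[] password;

    RealToken(byte[] identifier, byte[] password) {
      this.identifier = identifier.clone();
      this.password = password.clone();
    }
  }

  /** Stand-in for the Hadoop-wrapped token held in the job's credentials. */
  static final class WrappedToken {
    final String kind;
    final byte[] identifier;
    final byte[] password;

    WrappedToken(String kind, byte[] identifier, byte[] password) {
      this.kind = kind;
      this.identifier = identifier.clone();
      this.password = password.clone();
    }
  }

  /** Stand-in for the job's credentials: service name -> wrapped token. */
  static final class Credentials {
    private final Map<String, WrappedToken> tokens = new HashMap<>();

    void add(String serviceName, WrappedToken token) {
      tokens.put(serviceName, token);
    }

    WrappedToken get(String serviceName) {
      return tokens.get(serviceName);
    }
  }

  static final String EXPECTED_KIND = "ACCUMULO_AUTH_TOKEN";

  /** Returns non-stub tokens unchanged; replaces a stub with a token built from the credentials. */
  static AuthToken unwrap(Credentials credentials, AuthToken configured) {
    if (!(configured instanceof StubToken)) {
      return configured; // e.g. a password-style token needs no unwrapping
    }
    StubToken stub = (StubToken) configured;
    WrappedToken wrapped = credentials.get(stub.serviceName);
    if (wrapped == null || !EXPECTED_KIND.equals(wrapped.kind)) {
      throw new IllegalStateException(
          "No " + EXPECTED_KIND + " token for service " + stub.serviceName);
    }
    return new RealToken(wrapped.identifier, wrapped.password);
  }

  public static void main(String[] args) {
    Credentials credentials = new Credentials();
    credentials.add("my-instance",
        new WrappedToken(EXPECTED_KIND, new byte[] {1, 2}, new byte[] {3, 4}));
    AuthToken token = unwrap(credentials, new StubToken("my-instance"));
    System.out.println("Unwrapped to a real token: " + (token instanceof RealToken));
  }
}
```

In an actual job you would not write this yourself; the ConfiguratorBase
code linked above does the real work, and the token it returns is what
you hand to the Connector.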