Posted to user@crunch.apache.org by Tahir Hameed <ta...@gmail.com> on 2015/09/30 22:13:30 UTC

GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)

Hi,

I am facing a strange problem. I have 2 MR pipelines. One of them is working
fine; the other is not.

The only difference between them lies in one of the DoFn functions.

The DoFn that fails is given below:

    public PTable<ImmutableBytesWritable, CE> myFunction(
            PTable<ImmutableBytesWritable, Pair<A, B>> joinedData,
            PTable<String, C> others) {

        ReadableData<Pair<String, C>> readable = others.asReadable(false);
        ParallelDoOptions options = ParallelDoOptions.builder()
                .sourceTargets(readable.getSourceTargets())
                .build();

        return joinedData
                .by(someMapFunction, Avros.writables(ImmutableBytesWritable.class))
                .groupByKey()
                .parallelDo("", new CEDoFN(readable, others.getPTableType()),
                        Avros.tableOf(Avros.writables(ImmutableBytesWritable.class),
                                Avros.reflects(CE.class)),
                        options);
    }

The stack trace is as follows:

javax.security.sasl.SaslException: GSS initiate failed [Caused by
GSSException: No valid credentials provided (Mechanism level: Failed
to find any Kerberos tgt)]
	at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
	at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:177)
	at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupSaslConnection(RpcClient.java:815)
	at org.apache.hadoop.hbase.ipc.RpcClient$Connection.access$800(RpcClient.java:349)
	at org.apache.hadoop.hbase.ipc.RpcClient$Connection$2.run(RpcClient.java:943)
	at org.apache.hadoop.hbase.ipc.RpcClient$Connection$2.run(RpcClient.java:940)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.hadoop.hbase.ipc.RpcClient$Connection.setupIOstreams(RpcClient.java:940)
	at org.apache.hadoop.hbase.ipc.RpcClient$Connection.writeRequest(RpcClient.java:1094)
	at org.apache.hadoop.hbase.ipc.RpcClient$Connection.tracedWriteRequest(RpcClient.java:1061)
	at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1516)
	at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1724)
	at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1777)
	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.get(ClientProtos.java:30373)
	at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRowOrBefore(ProtobufUtil.java:1604)
	at org.apache.hadoop.hbase.client.HTable$2.call(HTable.java:768)
	at org.apache.hadoop.hbase.client.HTable$2.call(HTable.java:766)
	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
	at org.apache.hadoop.hbase.client.HTable.getRowOrBefore(HTable.java:772)
	at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:160)
	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.prefetchRegionCache(ConnectionManager.java:1254)
	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1318)
	at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1167)
	at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:294)
	at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:130)
	at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:55)
	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:201)
	at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:288)
	at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:268)
	at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:140)
	at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:135)
	at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:802)
	at org.apache.crunch.io.hbase.HTableIterator.<init>(HTableIterator.java:47)
	at org.apache.crunch.io.hbase.HTableIterable.iterator(HTableIterable.java:43)
	at org.apache.crunch.util.DelegatingReadableData$1.iterator(DelegatingReadableData.java:63)
	at com.bol.step.enrichmentdashboard.fn.CEDoFN.initialize(CEDoFN.java:45)
	at org.apache.crunch.impl.mr.run.RTNode.initialize(RTNode.java:71)
	at org.apache.crunch.impl.mr.run.RTNode.initialize(RTNode.java:73)
	at org.apache.crunch.impl.mr.run.CrunchReducer.setup(CrunchReducer.java:44)
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:168)
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: GSSException: No valid credentials provided (Mechanism
level: Failed to find any Kerberos tgt)


In CEDoFN, the readable is used in the initialize phase to build a HashMap;
that is also the place the stack trace points to.
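
For reference, the initialize logic is roughly shaped like this. This is a
hypothetical reconstruction, not the actual CEDoFN: the field names, generics
and process body are guesses, and A, B, C and CE stand for the real types.

    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.crunch.CrunchRuntimeException;
    import org.apache.crunch.DoFn;
    import org.apache.crunch.Emitter;
    import org.apache.crunch.Pair;
    import org.apache.crunch.ReadableData;
    import org.apache.crunch.types.PTableType;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;

    public class CEDoFN extends DoFn<Pair<ImmutableBytesWritable, Iterable<Pair<A, B>>>,
                                     Pair<ImmutableBytesWritable, CE>> {

        private final ReadableData<Pair<String, C>> readable;
        private final PTableType<String, C> ptype;
        private transient Map<String, C> lookup;

        public CEDoFN(ReadableData<Pair<String, C>> readable, PTableType<String, C> ptype) {
            this.readable = readable;
            this.ptype = ptype;
        }

        @Override
        public void initialize() {
            lookup = new HashMap<String, C>();
            try {
                // Iterating the ReadableData opens the HBase scanner on the task node;
                // this is the call at CEDoFN.java:45 that needs valid HBase credentials.
                for (Pair<String, C> entry : readable.read(getContext())) {
                    lookup.put(entry.first(), entry.second());
                }
            } catch (IOException e) {
                throw new CrunchRuntimeException(e);
            }
        }

        @Override
        public void process(Pair<ImmutableBytesWritable, Iterable<Pair<A, B>>> input,
                            Emitter<Pair<ImmutableBytesWritable, CE>> emitter) {
            // combine the grouped values with the lookup map and emit CE records
        }
    }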

In the function that succeeds, the parallelDo is performed directly on the
joinedData (which is also a PTable), and there are no errors. The initialize
phases of both DoFns are exactly the same.
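
For comparison, the working variant is roughly of this shape (again a sketch;
CEMapSideDoFN is a placeholder name, since a DoFn applied without groupByKey
sees ungrouped Pair<ImmutableBytesWritable, Pair<A, B>> records):

    public PTable<ImmutableBytesWritable, CE> myWorkingFunction(
            PTable<ImmutableBytesWritable, Pair<A, B>> joinedData,
            PTable<String, C> others) {

        ReadableData<Pair<String, C>> readable = others.asReadable(false);
        ParallelDoOptions options = ParallelDoOptions.builder()
                .sourceTargets(readable.getSourceTargets())
                .build();

        // No groupByKey here: the DoFn (and the HBase-backed readable it
        // materializes in initialize()) runs in the map phase of the job.
        return joinedData.parallelDo("", new CEMapSideDoFN(readable, others.getPTableType()),
                Avros.tableOf(Avros.writables(ImmutableBytesWritable.class),
                        Avros.reflects(CE.class)),
                options);
    }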

I fail to understand the cause of the errors, because the underlying
implementations of both PTable and PGroupedTable appear to be the same: both
extend PCollectionImpl.

Tahir

Re: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)

Posted by Gabriel Reid <ga...@gmail.com>.
On Thu, Oct 1, 2015 at 12:32 PM, Tahir Hameed <ta...@gmail.com> wrote:
> I tried the above method, but I'm using version 0.98.4.2.2.4.4-16-hadoop2
> for HBase on the cluster and version 0.12.0-hadoop2 for Apache Crunch. I've
> tried using 0.11.0-hadoop2 with a patch for CRUNCH-536 applied, but I run
> into other errors. I haven't been able to find a git release for
> 0.12.0-hadoop2 to add the CRUNCH-536 changes to.

You can check out the 0.13 release from here:
https://github.com/apache/crunch/tree/apache-crunch-0.13.0 (starting
from 0.13, Crunch is hadoop2-only). This release includes the changes
for CRUNCH-536.

>
> Also, I am already calling TableMapReduceUtil.initCredentials(mrJob.getJob())
> in my own code for all the tables I read. I read the table and convert it
> into a readable instance to be accessed in another DoFn. Of the 2
> pipelines, one works absolutely fine (no errors) and the other hits the
> Kerberos authentication errors. The only difference I see is the use of a
> PTable in one and a PGroupedTable in the other.

The use of a PTable vs a PGroupedTable affects the topology of the job
graph, which also changes what gets executed in which job. That can
definitely have an effect on how the ReadableData is used. From the stack
trace that you posted, it looks like an HTable is being read from within
the initialize method of your DoFn -- even though your Crunch code
probably defines the HBase-based PTable earlier, it's only read when
needed (which appears to be in the initialize method).

>
> Would sharing the code for instance be more helpful in identifying the
> problem?

Yes, definitely. If you can share a minimal example of the full code
that works and a version that doesn't work, it'll probably help a lot
in resolving this.

- Gabriel

Re: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)

Posted by Tahir Hameed <ta...@gmail.com>.
Thanks for the feedback,

I tried the above method, but I'm using version 0.98.4.2.2.4.4-16-hadoop2
for HBase on the cluster and version 0.12.0-hadoop2 for Apache Crunch.
I've tried using 0.11.0-hadoop2 with a patch for CRUNCH-536 applied, but I
run into other errors. I haven't been able to find a git release for
0.12.0-hadoop2 to add the CRUNCH-536 changes to.

Also, I am already calling TableMapReduceUtil.initCredentials(mrJob.getJob())
in my own code for all the tables I read. I read the table and convert it
into a readable instance to be accessed in another DoFn. Of the 2 pipelines,
one works absolutely fine (no errors) and the other hits the Kerberos
authentication errors. The only difference I see is the use of a PTable in
one and a PGroupedTable in the other.
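
To illustrate the pattern, here is a rough driver-side sketch with placeholder
names, assuming the FromHBase factory from crunch-hbase is available in this
version:

    // Read the HBase lookup table and expose it as ReadableData so that
    // another DoFn can materialize it in its initialize() method.
    PTable<ImmutableBytesWritable, Result> rawLookup =
            pipeline.read(FromHBase.table("lookup_table", new Scan()));

    // In practice the raw (ImmutableBytesWritable, Result) pairs are first
    // mapped to the PTable<String, C> the DoFn expects; that step is omitted.
    ReadableData<Pair<ImmutableBytesWritable, Result>> readable = rawLookup.asReadable(false);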

Would sharing the code for instance be more helpful in identifying the
problem?

Best,

Tahir

On Thu, Oct 1, 2015 at 8:58 AM, Gabriel Reid <ga...@gmail.com> wrote:
>
> If I'm reading that stack trace correctly, CEDoFn is reading from an
> HBase table in its initialize method (probably via a ReadableData)
> instance.
>
> It looks like the HBase instance is kerberized, which will mean that
> TableMapReduceUtil.initCredentials(Job) needs to be called before
> submitting the job.
>
> There was a relatively recent patch added in Crunch (see CRUNCH-536)
> to make it easier to add the call to
> TableMapReduceUtil.initCredentials. If you build a version of Crunch
> with CRUNCH-536 included, you should be able to add the following call
> during the setup of your pipeline:
>
>     pipeline.addPrepareHook(new CrunchControlledJob.Hook() {
>         @Override
>         public void run(MRJob mrJob) throws IOException {
>             TableMapReduceUtil.initCredentials(mrJob.getJob());
>         }
>     });
>
>
> - Gabriel
>
> On Wed, Sep 30, 2015 at 11:17 PM, Tahir Hameed <ta...@gmail.com> wrote:
> > It is HDFS. The setup for both pipelines is the same too.
> >
> > [earlier quoted messages and the original stack trace snipped]

Re: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)

Posted by Gabriel Reid <ga...@gmail.com>.
If I'm reading that stack trace correctly, CEDoFn is reading from an
HBase table in its initialize method (probably via a ReadableData)
instance.

It looks like the HBase instance is kerberized, which will mean that
TableMapReduceUtil.initCredentials(Job) needs to be called before
submitting the job.

There was a relatively recent patch added in Crunch (see CRUNCH-536)
to make it easier to add the call to
TableMapReduceUtil.initCredentials. If you build a version of Crunch
with CRUNCH-536 included, you should be able to add the following call
during the setup of your pipeline:

    pipeline.addPrepareHook(new CrunchControlledJob.Hook() {
        @Override
        public void run(MRJob mrJob) throws IOException {
            TableMapReduceUtil.initCredentials(mrJob.getJob());
        }
    });
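
For context, a minimal sketch of where such a hook might sit in the driver
(class and variable names are placeholders, imports omitted; this assumes a
Crunch build that includes CRUNCH-536):

    // Hypothetical driver setup: the hook runs just before each MR job in the
    // plan is submitted, so HBase delegation tokens are attached to every job.
    MRPipeline pipeline = new MRPipeline(MyDriver.class, HBaseConfiguration.create());

    pipeline.addPrepareHook(new CrunchControlledJob.Hook() {
        @Override
        public void run(MRJob mrJob) throws IOException {
            TableMapReduceUtil.initCredentials(mrJob.getJob());
        }
    });

    // ... define the HBase reads, joins and DoFns as usual, then run:
    pipeline.done();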


- Gabriel

On Wed, Sep 30, 2015 at 11:17 PM, Tahir Hameed <ta...@gmail.com> wrote:
> It is HDFS. The setup for both pipelines is the same too.
>
>
>
> On Wed, Sep 30, 2015 at 10:17 PM, Micah Whitacre <mk...@gmail.com>
> wrote:
>>
>> What is the datastore you are reading from? HBase? HDFS? Also, are there
>> any setup differences between the two pipelines?
>>
>> On Wed, Sep 30, 2015 at 3:13 PM, Tahir Hameed <ta...@gmail.com> wrote:
>>>
>>> [original message and stack trace snipped]
>

Re: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)

Posted by Tahir Hameed <ta...@gmail.com>.
It is HDFS. The setup for both pipelines is the same too.



On Wed, Sep 30, 2015 at 10:17 PM, Micah Whitacre <mk...@gmail.com>
wrote:

> What is the datastore you are reading from? HBase? HDFS? Also, are there
> any setup differences between the two pipelines?
>
> On Wed, Sep 30, 2015 at 3:13 PM, Tahir Hameed <ta...@gmail.com> wrote:
>
>> [original message and stack trace snipped]
>

Re: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)

Posted by Micah Whitacre <mk...@gmail.com>.
What is the datastore you are reading from? HBase? HDFS? Also, are there
any setup differences between the two pipelines?

On Wed, Sep 30, 2015 at 3:13 PM, Tahir Hameed <ta...@gmail.com> wrote:

> [original message and stack trace snipped]
>