Posted to user@flink.apache.org by "Raja.Aravapalli" <Ra...@target.com> on 2017/08/17 15:24:58 UTC

Flink application failing with Kerberos issue after running successfully without any issues for a few days

Hi Ted,

Find below the configuration I see in yarn-site.xml

<property>
  <name>yarn.resourcemanager.proxy-user-privileges.enabled</name>
  <value>true</value>
</property>
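
For anyone who can read the cluster's yarn-site.xml and wants to check this value programmatically, here is a minimal Python sketch. It assumes the usual Hadoop site-file layout with a <configuration> root element; the helper name read_hadoop_property and the inline sample are illustrative only and are not part of Hadoop or Flink.

```python
import xml.etree.ElementTree as ET

# Inline sample mirroring the property quoted above; a real check would
# read the text of the cluster's own yarn-site.xml instead.
SAMPLE_YARN_SITE = """<configuration>
  <property>
    <name>yarn.resourcemanager.proxy-user-privileges.enabled</name>
    <value>true</value>
  </property>
</configuration>"""


def read_hadoop_property(xml_text, name):
    """Return the <value> of the <property> whose <name> matches, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


if __name__ == "__main__":
    key = "yarn.resourcemanager.proxy-user-privileges.enabled"
    print(key, "=", read_hadoop_property(SAMPLE_YARN_SITE, key))
```

Where the real file lives depends on the installation, so the path is left out here.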


Regards,
Raja.


From: Ted Yu <yu...@gmail.com>
Date: Wednesday, August 16, 2017 at 9:05 PM
To: Raja Aravapalli <Ra...@target.com>
Cc: "user@flink.apache.org" <us...@flink.apache.org>
Subject: [EXTERNAL] Re: hadoop

Can you check the following config in yarn-site.xml?

yarn.resourcemanager.proxy-user-privileges.enabled (true)

Cheers

On Wed, Aug 16, 2017 at 4:48 PM, Raja.Aravapalli <Ra...@target.com> wrote:

Hi,

I started a Flink yarn-session on a running Hadoop cluster and am running a streaming application on it.

But after a few days of running without any issues, the Flink application, which writes data to HDFS, fails with the exception below.

Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token xxxxxx for xxxxxx) can't be found in cache


Can someone please help me fix this? Thanks a lot.



Regards,
Raja.


Re: [EXTERNAL] Re: Flink application failing with Kerberos issue after running successfully without any issues for a few days

Posted by "Raja.Aravapalli" <Ra...@target.com>.
Thanks Gordon.


Regards,
Raja.

Re: [EXTERNAL] Re: Flink application failing with Kerberos issue after running successfully without any issues for a few days

Posted by "Tzu-Li (Gordon) Tai" <tz...@apache.org>.
Hi Raja,

> Can you please confirm if I have to use the below settings to ensure I use keytabs?
>
>   *   security.kerberos.login.use-ticket-cache:
>       Indicates whether to read from your Kerberos ticket cache (default: true).
>
>   *   security.kerberos.login.keytab:
>       Absolute path to a Kerberos keytab file that contains the user credentials.
>
>   *   security.kerberos.login.principal:
>       Kerberos principal name associated with the keytab.
>
>   *   security.kerberos.login.contexts:
>       A comma-separated list of login contexts to provide the Kerberos credentials to (for example, Client,KafkaClient to use the credentials for ZooKeeper authentication and for Kafka authentication).

Yes, these are the exact configs that you’ll need to set.
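
As a rough illustration, the four options could be rendered for flink-conf.yaml along the lines below. This is only a sketch: the keytab path and principal are placeholders, setting use-ticket-cache to false is an assumption made to keep the ticket cache out of the picture, and the helper names are hypothetical. Only the four option names come from the list above.

```python
# Placeholder values; only the four keys are taken from the thread.
KEYTAB_SETTINGS = {
    "security.kerberos.login.use-ticket-cache": "false",         # assumption
    "security.kerberos.login.keytab": "/path/to/flink.keytab",   # placeholder
    "security.kerberos.login.principal": "flink-user@EXAMPLE.COM",  # placeholder
    "security.kerberos.login.contexts": "Client,KafkaClient",
}


def render_flink_conf(settings):
    """Render the settings as 'key: value' lines, as used in flink-conf.yaml."""
    return "\n".join(f"{key}: {value}" for key, value in settings.items())


def check_keytab_settings(settings):
    """Return a list of obvious problems; an empty list means none were found."""
    problems = []
    if not settings.get("security.kerberos.login.keytab", "").startswith("/"):
        problems.append("keytab path must be absolute")
    if not settings.get("security.kerberos.login.principal"):
        problems.append("principal is missing")
    return problems


if __name__ == "__main__":
    for problem in check_keytab_settings(KEYTAB_SETTINGS):
        print("WARNING:", problem)
    print(render_flink_conf(KEYTAB_SETTINGS))
```

The check on the keytab path reflects the "absolute path" wording in the option description; it does not verify that the file exists or that the principal matches it.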


> Also, a quick question: once I make these changes to use keytabs instead of the ticket cache, is there any place in the logs where I can check that the settings I made are in use and that the applications are no longer using the ticket cache?

You should be able to find logs such as “Adding keytab <keytab path> to the AM container …” at the beginning of the job submission.
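
One way to look for that line without reading the whole log is a small filter like the sketch below. It searches only for the phrase "Adding keytab" quoted above; the exact wording of the rest of the message, and which log file contains it, are not assumed here, so the path is passed in by hand. The function names are hypothetical.

```python
import sys

# Phrase taken from the log message quoted above; nothing else about the
# line's format is assumed.
KEYTAB_MARKER = "Adding keytab"


def find_keytab_lines(lines):
    """Return every line that contains the marker, without its newline."""
    return [line.rstrip("\n") for line in lines if KEYTAB_MARKER in line]


def main(log_path):
    """Print matching lines from the given log file; return how many matched."""
    with open(log_path, encoding="utf-8", errors="replace") as log_file:
        matches = find_keytab_lines(log_file)
    for match in matches:
        print(match)
    if not matches:
        print("No 'Adding keytab' line found in", log_path)
    return len(matches)


if __name__ == "__main__" and len(sys.argv) == 2:
    # The single argument is the path to the job submission log.
    main(sys.argv[1])
```

If nothing matches, that is only a hint that the keytab was not shipped; it is worth re-checking the configuration before drawing conclusions.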


Cheers,
Gordon

Re: [EXTERNAL] Re: Flink application failing with Kerberos issue after running successfully without any issues for a few days

Posted by "Raja.Aravapalli" <Ra...@target.com>.
Thanks a lot Eron…

If I understand you correctly, you suggest using keytabs to launch streaming applications.

Can you please confirm if I have to use the below settings to ensure I use keytabs?


  *   security.kerberos.login.use-ticket-cache:
      Indicates whether to read from your Kerberos ticket cache (default: true).

  *   security.kerberos.login.keytab:
      Absolute path to a Kerberos keytab file that contains the user credentials.

  *   security.kerberos.login.principal:
      Kerberos principal name associated with the keytab.

  *   security.kerberos.login.contexts:
      A comma-separated list of login contexts to provide the Kerberos credentials to (for example, Client,KafkaClient to use the credentials for ZooKeeper authentication and for Kafka authentication).

https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/config.html#kerberos-based-security-1


Also, a quick question: once I make these changes to use keytabs instead of the ticket cache, is there any place in the logs where I can check that the settings I made are in use and that the applications are no longer using the ticket cache?

Thanks a lot, in advance.


Regards,
Raja.

Re: [EXTERNAL] Re: Flink application failing with Kerberos issue after running successfully without any issues for a few days

Posted by Prabhu V <vp...@gmail.com>.
+1 on the 7-day expiry explanation.

This is most likely the cause.

I faced the 7-day expiry issue with a previous version of Flink that didn't
support keytabs. I am currently running flink-1.3 with keytabs (it has been
going okay for 2 days now). I will update after the 7-day mark.

Thanks,
Prabhu

Re: [EXTERNAL] Re: Flink application failing with Kerberos issue after running successfully without any issues for a few days

Posted by Eron Wright <er...@gmail.com>.
Raja,
According to those configuration values, the delegation token would be
automatically renewed every 24 hours, then expire entirely after 7 days.
You say that the job ran without issue for 'a few days'.  Can we conclude
that the job hit the 7-day delegation token (DT) expiration?

Flink supports the use of Kerberos keytabs as an alternative to delegation
tokens for exactly this reason, that delegation tokens eventually expire
and so aren't useful to a long-running program.   Consider making use of
keytabs here.

Hope this helps!
-Eron
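
The 24-hour and 7-day figures follow directly from the millisecond values Raja reported. Here is a quick Python check, using only the three numbers that appear in this thread (the constant and function names are made up for illustration):

```python
from datetime import timedelta

# Values reported in the thread, in milliseconds.
MAX_LIFETIME_MS = 604_800_000    # dfs.namenode.delegation.token.max-lifetime
RENEW_INTERVAL_MS = 86_400_000   # dfs.namenode.delegation.token.renew-interval
SUGGESTED_RENEW_MS = 28_800_000  # shorter renewal interval suggested by Ted


def as_timedelta(milliseconds):
    """Convert a millisecond count into a timedelta for readable output."""
    return timedelta(milliseconds=milliseconds)


if __name__ == "__main__":
    print("max lifetime:       ", as_timedelta(MAX_LIFETIME_MS))     # 7 days, 0:00:00
    print("renew interval:     ", as_timedelta(RENEW_INTERVAL_MS))   # 1 day, 0:00:00
    print("suggested interval: ", as_timedelta(SUGGESTED_RENEW_MS))  # 8:00:00
    # Number of renew intervals that fit into the maximum lifetime.
    print("renew intervals per lifetime:", MAX_LIFETIME_MS // RENEW_INTERVAL_MS)  # 7
```

On this reading, a shorter renewal interval changes how often the token is renewed but leaves the 7-day maximum lifetime untouched, which is why keytabs are the suggestion for long-running jobs.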


Re: [EXTERNAL] Re: Flink application failing with Kerberos issue after running successfully without any issues for a few days

Posted by Ted Yu <yu...@gmail.com>.
I think this needs to be done by the admin.

Re: [EXTERNAL] Re: Flink application failing with Kerberos issue after running successfully without any issues for a few days

Posted by "Raja.Aravapalli" <Ra...@target.com>.
I don’t have access to the site.xml files; they are controlled by a support team.

Does Flink have any configuration settings or APIs through which we can control this?


Regards,
Raja.

From: Ted Yu <yu...@gmail.com>
Date: Thursday, August 17, 2017 at 11:07 AM
To: Raja Aravapalli <Ra...@target.com>
Cc: "user@flink.apache.org" <us...@flink.apache.org>
Subject: Re: [EXTERNAL] Re: Flink application failing with Kerberos issue after running successfully without any issues for a few days

Can you try shortening the renewal interval to something like 28800000?

Cheers

On Thu, Aug 17, 2017 at 8:58 AM, Raja.Aravapalli <Ra...@target.com> wrote:
Hi Ted,

Below is what I see in the environment:

dfs.namenode.delegation.token.max-lifetime:          604800000
dfs.namenode.delegation.token.renew-interval:      86400000


Thanks.


Regards,
Raja.

From: Ted Yu <yu...@gmail.com>
Date: Thursday, August 17, 2017 at 10:46 AM
To: Raja Aravapalli <Ra...@target.com>
Cc: "user@flink.apache.org" <us...@flink.apache.org>
Subject: [EXTERNAL] Re: Flink application failing with Kerberos issue after running successfully without any issues for a few days

What are the values for the following parameters?

dfs.namenode.delegation.token.max-lifetime

dfs.namenode.delegation.token.renew-interval

Cheers

On Thu, Aug 17, 2017 at 8:24 AM, Raja.Aravapalli <Ra...@target.com>> wrote:
Hi Ted,

Find below the configuration I see in yarn-site.xml

<property>
      <name>yarn.resourcemanager.proxy-user-privileges.enabled</name>
      <value>true</value>
    </property>


Regards,
Raja.


From: Ted Yu <yu...@gmail.com>>
Date: Wednesday, August 16, 2017 at 9:05 PM
To: Raja Aravapalli <Ra...@target.com>>
Cc: "user@flink.apache.org<ma...@flink.apache.org>" <us...@flink.apache.org>>
Subject: [EXTERNAL] Re: hadoop

Can you check the following config in yarn-site.xml ?

yarn.resourcemanager.proxy-user-privileges.enabled (true)

Cheers

On Wed, Aug 16, 2017 at 4:48 PM, Raja.Aravapalli <Ra...@target.com>> wrote:

Hi,

I triggered an flink yarn-session on a running Hadoop cluster… and triggering streaming application on that.

But, I see after few days of running without any issues, the flink application which is writing data to hdfs failing with below exception.

Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token xxxxxx for xxxxxx) can't be found in cache


Can someone please help me how I can fix this. Thanks a lot.



Regards,
Raja.




Re: [EXTERNAL] Re: Fink application failing with kerberos issue after running successfully without any issues for few days

Posted by Ted Yu <yu...@gmail.com>.
Can you try shortening the renewal interval to something like 28800000?
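
For scale, these HDFS settings are expressed in milliseconds. A minimal Python sketch (the helper name ms_to_hours is made up for illustration) showing what the suggested value amounts to, next to the 86400000 reported elsewhere in this thread:

```python
def ms_to_hours(ms):
    """Convert a millisecond setting to hours."""
    return ms / (1000 * 60 * 60)

suggested = 28800000  # value proposed in this reply
reported = 86400000   # renew-interval reported elsewhere in this thread

print(ms_to_hours(suggested))  # 8.0
print(ms_to_hours(reported))   # 24.0
```

This only illustrates the unit conversion; whether a shorter interval resolves the InvalidToken error is not established in this thread.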

Cheers


Re: [EXTERNAL] Re: Fink application failing with kerberos issue after running successfully without any issues for few days

Posted by "Raja.Aravapalli" <Ra...@target.com>.
Hi Ted,

Below is what I see in the environment:

dfs.namenode.delegation.token.max-lifetime:    604800000
dfs.namenode.delegation.token.renew-interval:  86400000
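
As a quick sanity check on these numbers, assuming both are milliseconds (the usual unit for these HDFS settings), a small Python sketch can render them as durations. The helper name describe is hypothetical:

```python
from datetime import timedelta

# Values as reported above; assumed to be milliseconds.
settings = {
    "dfs.namenode.delegation.token.max-lifetime": 604800000,
    "dfs.namenode.delegation.token.renew-interval": 86400000,
}

def describe(ms):
    """Render a millisecond count as a human-readable duration."""
    return str(timedelta(milliseconds=ms))

for name, ms in settings.items():
    print(name, "=", describe(ms))
# dfs.namenode.delegation.token.max-lifetime = 7 days, 0:00:00
# dfs.namenode.delegation.token.renew-interval = 1 day, 0:00:00
```

A 7-day maximum lifetime would be consistent with a job that runs for a few days before its token is rejected, though that is an inference and not something confirmed in this thread.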


Thanks.


Regards,
Raja.


Re: Fink application failing with kerberos issue after running successfully without any issues for few days

Posted by Ted Yu <yu...@gmail.com>.
What are the values of the following parameters?

dfs.namenode.delegation.token.max-lifetime

dfs.namenode.delegation.token.renew-interval

Cheers
