Posted to dev@falcon.apache.org by Margus Roo <ma...@roo.ee> on 2016/02/22 07:58:26 UTC

All running processes are in UNKNOWN status

Hi

I am using Falcon 0.6.1.2.3, as packaged by Hortonworks in HDP-2.3.

I noticed that all my running processes go into UNKNOWN status after
some days. After restarting Falcon they are back in RUNNING status, and
after some days it happens again.

-- 
Margus (margusja) Roo
http://margus.roo.ee
skype: margusja
+372 51 48 780


Re: All running processes are in UNKNOWN status

Posted by Margus Roo <ma...@roo.ee>.
I think I have the same issue:
https://issues.apache.org/jira/browse/FALCON-1595
I have Falcon 0.6. Is there a workaround, given that I depend on
HDP-2.3?


On 23/02/16 09:13, Margus Roo wrote:
> After I restarted falcon yesterday and processes were RUNNING statuses 
> until today morning and now they turned to UNKNOWN again.
> In log I can see:
> 2016-02-23 08:55:16,947 WARN  - [Timer-2:] ~ Exception encountered 
> while connecting to the server :  (Client:680)
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed 
> to find any Kerberos tgt)]
> I can not figure out whose kerberos ticket is missing.


Re: All running processes are in UNKNOWN status

Posted by Margus Roo <ma...@roo.ee>.
After I restarted Falcon yesterday, the processes stayed in RUNNING status
until this morning, and now they have turned to UNKNOWN again.
In the log I can see:
2016-02-23 08:55:16,947 WARN  - [Timer-2:] ~ Exception encountered while connecting to the server :  (Client:680)
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
I cannot figure out whose Kerberos ticket is missing.
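
One way to narrow this down might be to measure how long after a restart the first "Failed to find any Kerberos tgt" record shows up. The Python sketch below is only an illustration: it assumes the log file uses the same timestamp prefix as the excerpt above, and the function names and the restart time you pass in are hypothetical. If the gap turns out to be close to the ticket lifetime configured on the cluster, that would point towards a ticket that is not being renewed, but the thread itself does not confirm this.

```python
import re
from datetime import datetime

# Record header as seen in the excerpts in this thread, e.g.
# "2016-02-23 08:55:16,947 WARN  - [Timer-2:] ~ ..."
RECORD_START = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} +(\w+)")
TGT_ERROR = "Failed to find any Kerberos tgt"


def kerberos_failure_times(lines):
    """Return timestamps of log records that mention the missing TGT."""
    hits = []
    current = None  # timestamp of the record we are currently inside
    for line in lines:
        match = RECORD_START.match(line)
        if match:
            current = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        # The GSSException text sits on a continuation line, so it is
        # attributed to the most recent record header.
        if TGT_ERROR in line and current is not None:
            if not hits or hits[-1] != current:
                hits.append(current)
    return hits


def hours_until_first_failure(restart_time, lines):
    """Hours from a known restart to the first TGT failure, or None."""
    later = [t for t in kerberos_failure_times(lines) if t >= restart_time]
    if not later:
        return None
    return (later[0] - restart_time).total_seconds() / 3600.0
```

To use it, read the Falcon application log with open(...) and pass the file object as lines, together with the time you restarted Falcon as a datetime.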


On 22/02/16 09:31, Pallavi Rao wrote:
> It might not have to do with a particular process. It might go into UNKNOWN
> status when Falcon is unable to communicate with Oozie, for example. What
> will help in this case is the falcon.application.log (Falcon server logs).
>
> Regards,
> Pallavi


Re: All running processes are in UNKNOWN status

Posted by Sandeep Samudrala <sa...@gmail.com>.
What is the memory allocated to the Falcon server? It could be that the Falcon
server is running out of memory. Can you check whether the Falcon server is
doing Full GCs?
If so, can you try removing
org.apache.falcon.metadata.MetadataMappingService
from startup.properties and then restart the Falcon server?
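
As a rough illustration of that edit, the Python sketch below removes one class name from a comma-separated list of services. It assumes the services in startup.properties are listed as a single comma-separated value, possibly split over several lines with trailing backslashes; the exact property key is not given in this thread, so check your own file. The helper names and the example.ServiceA / example.ServiceB entries are made up for the example.

```python
TARGET = "org.apache.falcon.metadata.MetadataMappingService"


def join_continuations(raw):
    """Join a properties value that is split with trailing backslashes."""
    parts = []
    for line in raw.splitlines():
        line = line.strip()
        if line.endswith("\\"):
            line = line[:-1].rstrip()
        parts.append(line)
    return "".join(parts)


def drop_service(value, class_name=TARGET):
    """Return the comma-separated list without class_name."""
    names = [n.strip() for n in value.split(",")]
    return ",".join(n for n in names if n and n != class_name)


# Hypothetical value, as it might appear in the file.
raw_value = (
    "example.ServiceA,\\\n"
    "    org.apache.falcon.metadata.MetadataMappingService,\\\n"
    "    example.ServiceB"
)
new_value = drop_service(join_continuations(raw_value))
```

Keep a backup of startup.properties before changing it, and restart Falcon afterwards so the new list takes effect.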

On Mon, Feb 22, 2016 at 1:09 PM, Margus Roo <ma...@roo.ee> wrote:

> Found rows from log:
>
> 2016-02-22 08:54:48,273 INFO  - [126455586@qtp-525968792-61 -
> 763a5818-27e2-4ada-8d45-50b06afffa8e:margusja:GET//entities/list/feed,process]
> ~ {Action:list, Dimensions:{}, Status: SUCCEEDED, Time-taken:579935193 ns}
> (METRIC:38)
> 2016-02-22 08:54:48,274 DEBUG - [126455586@qtp-525968792-61 -
> 763a5818-27e2-4ada-8d45-50b06afffa8e:] ~ Audit: margusja/10.65.104.39
> performed request
> http://hadoopnn2.estpak.ee:15000/api/entities/list/feed,process?fields=clusters,tags,status&offset=0&numResults=10
> (88.196.164.43) at time 2016-02-22T06:54Z (FalconAuditFilter:86)
> 2016-02-22 08:55:10,388 INFO  - [ActiveMQ ShutdownHook:] ~ ActiveMQ
> Message Broker (localhost, ID:hadoopnn2.estpak.ee-48159-1455867360485-0:1)
> is shutting down (BrokerService:560)
> 2016-02-22 08:55:10,389 INFO  - [ActiveMQ ShutdownHook:] ~ Connector
> vm://localhost Stopped (TransportConnector:288)
> 2016-02-22 08:55:10,652 INFO  - [ActiveMQ Connection Executor: tcp://
> hadoopnn2.estpak.ee/88.196.164.43:61616:] ~ Error in onException for
> topicSubscriber of topic: FALCON.ENTITY.TOPIC (JMSMessageConsumer:144)
> javax.jms.JMSException: java.io.EOFException
>         at
> org.apache.activemq.util.JMSExceptionSupport.create(JMSExceptionSupport.java:49)
>         at
> org.apache.activemq.ActiveMQConnection.onAsyncException(ActiveMQConnection.java:1833)
>         at
> org.apache.activemq.ActiveMQConnection.onException(ActiveMQConnection.java:1850)
>         at
> org.apache.activemq.transport.TransportFilter.onException(TransportFilter.java:101)
>         at
> org.apache.activemq.transport.ResponseCorrelator.onException(ResponseCorrelator.java:126)
>         at
> org.apache.activemq.transport.TransportFilter.onException(TransportFilter.java:101)
>         at
> org.apache.activemq.transport.TransportFilter.onException(TransportFilter.java:101)
>         at
> org.apache.activemq.transport.WireFormatNegotiator.onException(WireFormatNegotiator.java:160)
>         at
> org.apache.activemq.transport.InactivityMonitor.onException(InactivityMonitor.java:266)
>         at
> org.apache.activemq.transport.TransportSupport.onException(TransportSupport.java:96)
>         at
> org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:206)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.EOFException
>         at java.io.DataInputStream.readInt(DataInputStream.java:392)
>         at
> org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269)
>         at
> org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:227)
>         at
> org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:219)
>         at
> org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:202)
>         ... 1 more
>
>
> And before that there are loads of kerberos related problems:
> 2016-02-22 08:54:48,272 WARN  - [126455586@qtp-525968792-61 -
> 763a5818-27e2-4ada-8d45-50b06afffa8e:margusja:GET//entities/list/feed,process]
> ~ Exception while invoking class
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo
> over hadoopnn2.estpak.ee/88.196.164.43:8020. Not retrying because
> failovers (15) exceeded maximum allowed (15) (RetryInvocationHandler:121)
> java.io.IOException: Failed on local exception: java.io.IOException:
> javax.security.sasl.SaslException: GSS initiate failed [Caused by
> GSSException: No valid credentials provided (Mechanism level: Failed to
> find any Kerberos tgt)]; Host Details : local host is: "
> hadoopnn2.estpak.ee/88.196.164.43"; destination host is: "
> hadoopnn2.estpak.ee":8020;
>         at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:773)
>
> But thous kerberos problems will resolve after falcon restart.
>
> Anyway this is not the right list as I understand. Can you provide my user@
> subscription e-mail?

Re: All running processes are in UNKNOWN status

Posted by Margus Roo <ma...@roo.ee>.
I found these lines in the log:

2016-02-22 08:54:48,273 INFO  - [126455586@qtp-525968792-61 - 763a5818-27e2-4ada-8d45-50b06afffa8e:margusja:GET//entities/list/feed,process] ~ {Action:list, Dimensions:{}, Status: SUCCEEDED, Time-taken:579935193 ns} (METRIC:38)
2016-02-22 08:54:48,274 DEBUG - [126455586@qtp-525968792-61 - 763a5818-27e2-4ada-8d45-50b06afffa8e:] ~ Audit: margusja/10.65.104.39 performed request http://hadoopnn2.estpak.ee:15000/api/entities/list/feed,process?fields=clusters,tags,status&offset=0&numResults=10 (88.196.164.43) at time 2016-02-22T06:54Z (FalconAuditFilter:86)
2016-02-22 08:55:10,388 INFO  - [ActiveMQ ShutdownHook:] ~ ActiveMQ Message Broker (localhost, ID:hadoopnn2.estpak.ee-48159-1455867360485-0:1) is shutting down (BrokerService:560)
2016-02-22 08:55:10,389 INFO  - [ActiveMQ ShutdownHook:] ~ Connector vm://localhost Stopped (TransportConnector:288)
2016-02-22 08:55:10,652 INFO  - [ActiveMQ Connection Executor: tcp://hadoopnn2.estpak.ee/88.196.164.43:61616:] ~ Error in onException for topicSubscriber of topic: FALCON.ENTITY.TOPIC (JMSMessageConsumer:144)
javax.jms.JMSException: java.io.EOFException
         at org.apache.activemq.util.JMSExceptionSupport.create(JMSExceptionSupport.java:49)
         at org.apache.activemq.ActiveMQConnection.onAsyncException(ActiveMQConnection.java:1833)
         at org.apache.activemq.ActiveMQConnection.onException(ActiveMQConnection.java:1850)
         at org.apache.activemq.transport.TransportFilter.onException(TransportFilter.java:101)
         at org.apache.activemq.transport.ResponseCorrelator.onException(ResponseCorrelator.java:126)
         at org.apache.activemq.transport.TransportFilter.onException(TransportFilter.java:101)
         at org.apache.activemq.transport.TransportFilter.onException(TransportFilter.java:101)
         at org.apache.activemq.transport.WireFormatNegotiator.onException(WireFormatNegotiator.java:160)
         at org.apache.activemq.transport.InactivityMonitor.onException(InactivityMonitor.java:266)
         at org.apache.activemq.transport.TransportSupport.onException(TransportSupport.java:96)
         at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:206)
         at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.EOFException
         at java.io.DataInputStream.readInt(DataInputStream.java:392)
         at org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269)
         at org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:227)
         at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:219)
         at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:202)
         ... 1 more


And before that there are a lot of Kerberos-related problems:
2016-02-22 08:54:48,272 WARN  - [126455586@qtp-525968792-61 - 763a5818-27e2-4ada-8d45-50b06afffa8e:margusja:GET//entities/list/feed,process] ~ Exception while invoking class org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo over hadoopnn2.estpak.ee/88.196.164.43:8020. Not retrying because failovers (15) exceeded maximum allowed (15) (RetryInvocationHandler:121)
java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "hadoopnn2.estpak.ee/88.196.164.43"; destination host is: "hadoopnn2.estpak.ee":8020;
         at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:773)

But those Kerberos problems go away after a Falcon restart.

Anyway, as I understand it, this is not the right list. Can you give me the
subscription e-mail address for the user@ list?


On 22/02/16 09:31, Pallavi Rao wrote:
> It might not have to do with a particular process. It might go into UNKNOWN
> status when Falcon is unable to communicate with Oozie, for example. What
> will help in this case is the falcon.application.log (Falcon server logs).
>
> Regards,
> Pallavi


Re: All running processes are in UNKNOWN status

Posted by Pallavi Rao <pa...@inmobi.com>.
It might not have to do with a particular process. It might go into UNKNOWN
status when Falcon is unable to communicate with Oozie, for example. What
will help in this case is the falcon.application.log (Falcon server logs).

Regards,
Pallavi

On Mon, Feb 22, 2016 at 12:49 PM, Margus Roo <ma...@roo.ee> wrote:

> It is difficult because I have already more than ten processes are running
> and I do not know exact moment when they are going in to UNKNOWN status.
> I just hoped that it had happened before and someone in this list have
> ideas.
> So you think it is related with processes?
> Then I can start only one process and then I see is it going to UNKNOWN.
>
> I tried to subscribe to user@ list but no success. In falcon site I can
> not find user list subscribe e-mail. If you can provide it I can ask help
> from user list.


Re: All running processes are in UNKNOWN status

Posted by Margus Roo <ma...@roo.ee>.
It is difficult because I already have more than ten processes running,
and I do not know the exact moment when they go into UNKNOWN status.
I just hoped that this had happened before and that someone on this list
would have ideas.
So you think it is related to the processes?
Then I can start only one process and see whether it goes to UNKNOWN.

I tried to subscribe to the user@ list but without success. On the Falcon
site I cannot find the subscription e-mail address for the user list. If
you can provide it, I can ask the user list for help.


On 22/02/16 09:14, Sandeep Samudrala wrote:
> Hi Margus,
> Please do send such queries over users mailing list. Can you attach your
> process definition and also can you check application.log. Please attach
> any stack trace if any.
>
> Thanks,
> -Sandeep

Re: All running processes are in UNKNOWN status

Posted by Sandeep Samudrala <sa...@gmail.com>.
Hi Margus,
Please send such queries to the users mailing list. Can you attach your
process definition and also check application.log? Please attach the
stack trace if there is one.

Thanks,
-Sandeep
On Feb 22, 2016 12:28 PM, "Margus Roo" <ma...@roo.ee> wrote:

> Hi
>
> I am using Falcon- 0.6.1.2.3 packaged by Hortonworks HDP-2.3
>
> I noticed that all my running processes go after some days in to UNKNOWN
> status. After restarting Falcon they are back in RUNNING status. And after
> some days it is repeating again.