Posted to user@ambari.apache.org by Marco <ma...@gmail.com> on 2015/07/01 15:42:13 UTC
Restart of flume-agents bug
Hi,
I'm having trouble restarting Flume agents with Ambari.
I've found this JIRA entry,
https://issues.apache.org/jira/browse/AMBARI-10657, which describes my
problem ("...var/run/flume/a2.pid' returned 1.").
Since I am using the Hortonworks distribution (Ambari 2.0.0), I cannot just
upgrade or patch. Is there any workaround for this issue? I've tried
deleting the pid file, but with no effect.
Thanks,
Marco
Re: Restart of flume-agents bug
Posted by Marco <ma...@gmail.com>.
I can't check this before next week, but I don't think it is a config
error, since one agent on one node is running. I also hit the issue several
times with the Hortonworks sandbox (running locally in VMware) but was not
able to reproduce it.
I'll let you know the output.
BR Marco
--
Best regards,
Marco
Re: Restart of flume-agents bug
Posted by Sumit Mohanty <sm...@hortonworks.com>.
What output does running
pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*agent1.*
produce? If you have a single agent, you can also run ps aux and grep for flume.
In my case, for example, I see
[root@smb201-1 ~]# pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*a1.*
16794
Is it possible that the configuration for the Flume agent "agent1" has an issue?
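For readers following along: pgrep exits with status 1 when no process matches, which is what "returned 1" in the Ambari error means. The sketch below approximates the `pgrep -f` match in Python with `re.search` against a full command line, to show why the check fails when the agent's java process is not running. The function names and the sample command line are hypothetical and only for illustration; pgrep itself uses POSIX extended regular expressions, which behave the same as Python's `re` for a pattern this simple.

```python
import re


def agent_pattern(java_home, agent):
    # Same shape as the pattern in the thread: ^<java_home>.*<agent>.*
    return "^" + java_home + ".*" + agent + ".*"


def pgrep_f_matches(pattern, cmdline):
    # Approximates `pgrep -f PATTERN` for a single process:
    # the regex is searched in the full command line.
    return re.search(pattern, cmdline) is not None


# Hypothetical command line of a running agent named "a1" (illustration only).
running = (
    "/usr/jdk64/jdk1.7.0_67/bin/java -cp /etc/flume/conf/a1 "
    "org.apache.flume.node.Application --name a1"
)

pattern_a1 = agent_pattern("/usr/jdk64/jdk1.7.0_67", "a1")
pattern_agent1 = agent_pattern("/usr/jdk64/jdk1.7.0_67", "agent1")

print(pgrep_f_matches(pattern_a1, running))      # True
print(pgrep_f_matches(pattern_agent1, running))  # False
```

If no command line on the host matches the pattern for "agent1", for example because the agent exited shortly after it was launched, the real pgrep prints nothing and returns 1.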
Re: Restart of flume-agents bug
Posted by Marco <ma...@gmail.com>.
error:
<<<<
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 214, in execute
    method(env)
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/FLUME/1.4.0.2.0/package/scripts/flume_handler.py", line 56, in start
    flume(action='start')
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/FLUME/1.4.0.2.0/package/scripts/flume.py", line 161, in flume
    try_sleep=10)
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 274, in action_run
    raise ex
Fail: Execution of 'pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*agent1.* > /var/run/flume/agent1.pid' returned 1.
>>>>
output:
<<<
2015-07-01 14:08:03,131 - u'Execute[\'ambari-sudo.sh su flume -l -s /bin/bash -c \'export PATH=\'"\'"\'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/sbin:/usr/sbin:/bin:/usr/bin:/var/lib/ambari-agent\'"\'"\' JAVA_HOME=/usr/jdk64/jdk1.7.0_67 ; /usr/hdp/current/flume-server/bin/flume-ng agent --name agent1 --conf /etc/flume/conf/agent1 --conf-file /etc/flume/conf/agent1/flume.conf -Dflume.monitoring.type=org.apache.hadoop.metrics2.sink.flume.FlumeTimelineMetricsSink -Dflume.monitoring.node=hostname:6188 > /var/log/flume/agent1.out 2>&1\' &\']' {'environment': {'JAVA_HOME': u'/usr/jdk64/jdk1.7.0_67'}, 'wait_for_finish': False}
2015-07-01 14:08:03,136 - u"Execute['pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*agent1.* > /var/run/flume/agent1.pid']" {'logoutput': True, 'tries': 20, 'try_sleep': 10}
2015-07-01 14:08:03,179 - Retrying after 10 seconds. Reason: Execution of 'pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*agent1.* > /var/run/flume/agent1.pid' returned 1.
2015-07-01 14:08:13,233 - Retrying after 10 seconds. Reason: Execution of 'pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*agent1.* > /var/run/flume/agent1.pid' returned 1.
2015-07-01 14:08:23,280 - Retrying after 10 seconds. Reason: Execution of 'pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*agent1.* > /var/run/flume/agent1.pid' returned 1.
2015-07-01 14:08:33,334 - Retrying after 10 seconds. Reason: Execution of 'pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*agent1.* > /var/run/flume/agent1.pid' returned 1.
2015-07-01 14:08:43,389 - Retrying after 10 seconds. Reason: Execution of 'pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*agent1.* > /var/run/flume/agent1.pid' returned 1.
2015-07-01 14:08:53,440 - Retrying after 10 seconds. Reason: Execution of 'pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*agent1.* > /var/run/flume/agent1.pid' returned 1.
2015-07-01 14:09:03,511 - Retrying after 10 seconds. Reason: Execution of 'pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*agent1.* > /var/run/flume/agent1.pid' returned 1.
2015-07-01 14:09:13,565 - Retrying after 10 seconds. Reason: Execution of 'pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*agent1.* > /var/run/flume/agent1.pid' returned 1.
2015-07-01 14:09:23,619 - Retrying after 10 seconds. Reason: Execution of 'pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*agent1.* > /var/run/flume/agent1.pid' returned 1.
2015-07-01 14:09:33,673 - Retrying after 10 seconds. Reason: Execution of 'pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*agent1.* > /var/run/flume/agent1.pid' returned 1.
2015-07-01 14:09:43,722 - Retrying after 10 seconds. Reason: Execution of 'pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*agent1.* > /var/run/flume/agent1.pid' returned 1.
2015-07-01 14:09:53,772 - Retrying after 10 seconds. Reason: Execution of 'pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*agent1.* > /var/run/flume/agent1.pid' returned 1.
2015-07-01 14:10:03,826 - Retrying after 10 seconds. Reason: Execution of 'pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*agent1.* > /var/run/flume/agent1.pid' returned 1.
2015-07-01 14:10:13,880 - Retrying after 10 seconds. Reason: Execution of 'pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*agent1.* > /var/run/flume/agent1.pid' returned 1.
2015-07-01 14:10:23,928 - Retrying after 10 seconds. Reason: Execution of 'pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*agent1.* > /var/run/flume/agent1.pid' returned 1.
2015-07-01 14:10:33,982 - Retrying after 10 seconds. Reason: Execution of 'pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*agent1.* > /var/run/flume/agent1.pid' returned 1.
2015-07-01 14:10:44,037 - Retrying after 10 seconds. Reason: Execution of 'pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*agent1.* > /var/run/flume/agent1.pid' returned 1.
2015-07-01 14:10:54,083 - Retrying after 10 seconds. Reason: Execution of 'pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*agent1.* > /var/run/flume/agent1.pid' returned 1.
2015-07-01 14:11:04,137 - Retrying after 10 seconds. Reason: Execution of 'pgrep -o -u flume -f ^/usr/jdk64/jdk1.7.0_67.*agent1.* > /var/run/flume/agent1.pid' returned 1.
2015-07-01 14:11:14,190 - Error while executing command 'start':
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 214, in execute
    method(env)
>>>
Thanks,
Marco
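The output above shows the pid check being run with 'tries': 20 and 'try_sleep': 10: nineteen "Retrying" lines roughly ten seconds apart, then a failure on the twentieth attempt. A minimal Python sketch of that retry pattern follows. It is not Ambari's implementation; `run_with_retries` is a hypothetical name, and the message wording only imitates the log.

```python
import subprocess
import sys
import time


def run_with_retries(argv, tries=20, try_sleep=10):
    """Run argv until it exits 0, at most `tries` times.

    Returns the number of attempts used; raises RuntimeError if every
    attempt fails.
    """
    code = None
    for attempt in range(1, tries + 1):
        code = subprocess.call(argv)
        if code == 0:
            return attempt
        if attempt < tries:
            print("Retrying after %d seconds. Reason: command returned %d."
                  % (try_sleep, code))
            time.sleep(try_sleep)
    raise RuntimeError("Execution of %r returned %d." % (argv, code))


if __name__ == "__main__":
    # A command that always fails, standing in for the pgrep check.
    failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    try:
        run_with_retries(failing, tries=3, try_sleep=0)
    except RuntimeError as error:
        print(error)
```

With the values from the log, a check that never succeeds takes a little over three minutes to give up, which matches the 14:08:03 to 14:11:14 window above.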
Re: Restart of flume-agents bug
Posted by Sumit Mohanty <sm...@hortonworks.com>.
When you start Flume using Ambari, the /var/lib/ambari-agent/data folder on the host will contain the corresponding command output and error files. Can you share those?
Feel free to send a direct email, as I think the Apache mailing list will not allow attachments.
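One way to find the most recent command output in that folder is to sort its files by modification time. The sketch below is only an illustration: `newest_files` is a hypothetical helper, and it makes no assumption about how Ambari names the files.

```python
import os


def newest_files(directory, limit=5):
    """Return up to `limit` regular files in `directory`, newest first."""
    paths = [os.path.join(directory, name) for name in os.listdir(directory)]
    files = [path for path in paths if os.path.isfile(path)]
    files.sort(key=os.path.getmtime, reverse=True)
    return files[:limit]


if __name__ == "__main__":
    data_dir = "/var/lib/ambari-agent/data"  # folder named in the thread
    if os.path.isdir(data_dir):
        for path in newest_files(data_dir):
            print(path)
```

Running it right after a failed start from Ambari should put the files for that command at the top of the list.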
Re: Restart of flume-agents bug
Posted by Marco <ma...@gmail.com>.
I've tried this but could not find any related processes.
I searched with
pgrep -fl flume
pgrep -fl agent1
I've also restarted the corresponding server.
If I try to restart the Flume agent, I get the same issue :(
I've also tried deleting /var/run/flume and recreating it, also with no
effect.
BR Marco
Re: Restart of flume-agents bug
Posted by Sumit Mohanty <sm...@hortonworks.com>.
If Flume agents are running, you need to kill those processes as well as delete the pid files.
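As a rough illustration of the pid file side of this advice, the Python sketch below removes a pid file only when it is empty, unparsable, or points at a process that no longer exists. The helper names are hypothetical, the check is POSIX-only, and it deliberately never kills anything. An empty pid file is a realistic case here, because the shell creates the redirect target of `pgrep ... > /var/run/flume/agent1.pid` even when pgrep finds nothing.

```python
import os


def pid_is_running(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 checks for existence, sends nothing
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def remove_if_stale(pid_file):
    """Delete pid_file unless it names a live process.

    Returns True if the file was removed, False if it was kept.
    """
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except ValueError:
        pid = None  # empty or unparsable file
    if pid is not None and pid > 0 and pid_is_running(pid):
        return False
    os.remove(pid_file)
    return True
```

A file that is kept means a process with that PID is still alive, so that process would have to be stopped first, as described above.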
Re: Restart of flume-agents bug
Posted by Marco <ma...@gmail.com>.
The config was indeed broken on a couple of nodes. Thanks anyway for your
help.