Posted to dev@storm.apache.org by Sachin Pasalkar <Sa...@symantec.com> on 2017/01/17 07:28:07 UTC

Stop user from killing topology before X (configured) amount of time

Currently a user can kill a topology immediately, without waiting for in-flight messages to be processed. For example, if Storm is writing to a file when the user kills the topology, the file is not closed or moved to its proper location. We need to educate the operations team to do the right thing, but there is still a chance that the procedure will not be followed, leaving the system in an inconsistent state.

Can we set a mandatory (configurable) timeout when a user kills a Storm topology? The user should not be allowed to kill a topology with a wait time shorter than the configured one.

Some cases:
1) If the topology is long running, don't allow the user to kill it with a wait time shorter than the configured one.
2) If the topology was just deployed, allow the user to kill it instantly (as the deployment might have been a mistake).
3) Handle the same cases from the command line.

Thanks,
Sachin
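
The cases listed in the post above could be sketched as a small policy function. This is only an illustration under stated assumptions: `KillWaitPolicy`, `graceSecs`, and `minWaitSecs` are hypothetical names, not part of Storm, and a fixed grace window after deployment is just one possible reading of case 2.

```java
/** Hypothetical helper (not a Storm API): decides how long a kill request must wait. */
final class KillWaitPolicy {
    private KillWaitPolicy() {}

    /**
     * @param uptimeSecs  how long the topology has been running
     * @param graceSecs   assumed window after deployment in which an instant kill is allowed
     * @param minWaitSecs assumed configured minimum wait for longer-running topologies
     * @return the smallest wait, in seconds, that a kill request must honor
     */
    static int requiredWaitSecs(long uptimeSecs, long graceSecs, int minWaitSecs) {
        if (uptimeSecs < 0 || graceSecs < 0 || minWaitSecs < 0) {
            throw new IllegalArgumentException("arguments must be non-negative");
        }
        // Case 2: a freshly deployed topology may be killed instantly.
        if (uptimeSecs < graceSecs) {
            return 0;
        }
        // Case 1: a long-running topology must honor the configured minimum.
        return minWaitSecs;
    }
}
```

With a 60-second grace window and a 30-second minimum, `requiredWaitSecs(10, 60, 30)` returns 0 and `requiredWaitSecs(3600, 60, 30)` returns 30. Case 3 would follow if the command-line path went through the same function.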


Re: Stop user from killing topology before X (configured) amount of time

Posted by Bobby Evans <ev...@yahoo-inc.com.INVALID>.
1.x and master have diverged a lot.  If you want this on 1.x you will need to create two pull requests, one for master and one for 1.x-branch: https://github.com/apache/storm/tree/1.x-branch
The code you want to look at for this is here:
https://github.com/apache/storm/blob/1.x-branch/storm-core/src/clj/org/apache/storm/daemon/nimbus.clj#L1722-L1734

- Bobby

On Wednesday, January 18, 2017, 10:47:37 AM CST, Sachin Pasalkar <Sa...@symantec.com> wrote:
Sorry for spamming.

Can someone help by pointing out the branch? I want to fix this if possible.

Thanks,
Sachin



Re: Stop user from killing topology before X (configured) amount of time

Posted by Sachin Pasalkar <Sa...@symantec.com>.
Sorry for spamming.

Can someone help by pointing out the branch? I want to fix this if possible.

Thanks,
Sachin

On 18/01/17, 8:06 AM, "Sachin Pasalkar" <Sa...@symantec.com>
wrote:

>Can you point out the branch I need to check out? I don't see the Nimbus
>class mentioned below in the 1.x branch.
>


Re: Stop user from killing topology before X (configured) amount of time

Posted by Sachin Pasalkar <Sa...@symantec.com>.
Can you point out the branch I need to check out? I don't see the Nimbus
class mentioned below in the 1.x branch.

On 17/01/17, 10:35 PM, "Sachin Pasalkar" <Sa...@symantec.com>
wrote:

>Hi Bobby,
>
>Thanks for the response. I have created JIRA
>https://issues.apache.org/jira/browse/STORM-2299. I will try to take a look
>at it. I may ask for some information if needed.
>
>Regards,
>Sachin
>


Re: Stop user from killing topology before X (configured) amount of time

Posted by Sachin Pasalkar <Sa...@symantec.com>.
Hi Bobby,

Thanks for the response. I have created JIRA https://issues.apache.org/jira/browse/STORM-2299. I will try to take a look at it. I may ask for some information if needed.

Regards,
Sachin

From: Bobby Evans <ev...@yahoo-inc.com>
Date: Tuesday, 17 January 2017 at 9:16 PM
To: Sachin Pasalkar <sa...@symantec.com>, "dev@storm.apache.org" <de...@storm.apache.org>
Subject: Re: Stop user from killing topology before X (configured) amount of time

I would like to add that it would be good to have an admin override on this.  If someone accidentally makes the wait time 100 mins instead of 100 seconds, it would be good to have an admin be able to really truly kill it faster.


- Bobby



Re: Stop user from killing topology before X (configured) amount of time

Posted by Bobby Evans <ev...@yahoo-inc.com.INVALID>.
I would like to add that it would be good to have an admin override on this.  If someone accidentally makes the wait time 100 mins instead of 100 seconds, it would be good to have an admin be able to really truly kill it faster.


- Bobby
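
The admin override suggested above might look roughly like the following. It is a sketch only: `KillOverride` and the `isAdmin` flag are hypothetical, and how Storm would actually decide that a caller is an admin is not covered by this thread. Here a non-admin request is raised to the minimum rather than rejected, which is a design choice made for the example.

```java
/** Hypothetical helper (not a Storm API): applies a minimum kill wait with an admin bypass. */
final class KillOverride {
    private KillOverride() {}

    /**
     * @param requestedWaitSecs wait the caller asked for
     * @param minWaitSecs       assumed configured minimum wait
     * @param isAdmin           assumed flag saying the caller may bypass the minimum
     * @return the wait, in seconds, that would actually be applied
     */
    static int effectiveWaitSecs(int requestedWaitSecs, int minWaitSecs, boolean isAdmin) {
        if (requestedWaitSecs < 0 || minWaitSecs < 0) {
            throw new IllegalArgumentException("wait times must be non-negative");
        }
        // An admin can really kill it faster, even if the minimum was misconfigured.
        if (isAdmin) {
            return requestedWaitSecs;
        }
        // Everyone else waits at least the configured minimum.
        return Math.max(requestedWaitSecs, minWaitSecs);
    }
}
```

For the misconfiguration described above (6000 seconds instead of 100), `effectiveWaitSecs(100, 6000, true)` returns 100, while the same request from a non-admin returns 6000.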

On Tuesday, January 17, 2017, 9:44:08 AM CST, Bobby Evans <ev...@yahoo-inc.com> wrote:
In order to kill a topology with no wait period the operator needs to supply extra arguments `-w 0`  or the code needs to be a few lines longer to pass in the KillOptions with a 0 timeout.  If you want a configured minimum timeout for a given topology I think that would be perfectly fine.  We do not currently support that, but please file a JIRA and hopefully someone can take a look at supporting it.  You can probably do a lot of the work yourself if you want to.
The function you care about is here
https://github.com/apache/storm/blob/51c8474143b0081ff0522b0367f3efdba2689089/storm-core/src/jvm/org/apache/storm/daemon/nimbus/Nimbus.java#L2573-L2595
and it really would be mostly inserting a check 
probably after this line
https://github.com/apache/storm/blob/51c8474143b0081ff0522b0367f3efdba2689089/storm-core/src/jvm/org/apache/storm/daemon/nimbus/Nimbus.java#L2583
to be sure the waitAmount is >= the configured minimum.


- Bobby


Re: Stop user from killing topology before X (configured) amount of time

Posted by Bobby Evans <ev...@yahoo-inc.com.INVALID>.
In order to kill a topology with no wait period the operator needs to supply extra arguments `-w 0`  or the code needs to be a few lines longer to pass in the KillOptions with a 0 timeout.  If you want a configured minimum timeout for a given topology I think that would be perfectly fine.  We do not currently support that, but please file a JIRA and hopefully someone can take a look at supporting it.  You can probably do a lot of the work yourself if you want to.
The function you care about is here
https://github.com/apache/storm/blob/51c8474143b0081ff0522b0367f3efdba2689089/storm-core/src/jvm/org/apache/storm/daemon/nimbus/Nimbus.java#L2573-L2595
and it really would be mostly inserting a check 
probably after this line
https://github.com/apache/storm/blob/51c8474143b0081ff0522b0367f3efdba2689089/storm-core/src/jvm/org/apache/storm/daemon/nimbus/Nimbus.java#L2583
to be sure the waitAmount is >= the configured minimum.


- Bobby
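
The check described above, making sure the wait amount is >= the configured minimum, could be sketched as below. This is not the Nimbus code itself: `MinKillWaitCheck` and the config key `topology.min.kill.wait.secs` are hypothetical names invented for the example, and treating a missing key as "no minimum" is an assumption.

```java
import java.util.Map;

/** Hypothetical check (not a Storm API): rejects kill requests below a configured minimum wait. */
final class MinKillWaitCheck {
    /** Hypothetical config key; Storm does not define it. */
    static final String MIN_KILL_WAIT_KEY = "topology.min.kill.wait.secs";

    private MinKillWaitCheck() {}

    /** Throws if waitAmount is below the minimum found in the topology configuration. */
    static void check(Map<String, Object> topoConf, int waitAmount) {
        Object raw = topoConf.get(MIN_KILL_WAIT_KEY);
        // Assumption: a missing or non-numeric value means no minimum is enforced.
        int minWait = (raw instanceof Number) ? ((Number) raw).intValue() : 0;
        if (waitAmount < minWait) {
            throw new IllegalArgumentException(
                "kill wait of " + waitAmount + "s is below the configured minimum of " + minWait + "s");
        }
    }
}
```

A request with `-w 0` against a topology whose configuration sets the key to 30 would be rejected by this check, while a wait of 30 or more would pass.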

On Tuesday, January 17, 2017, 1:28:32 AM CST, Sachin Pasalkar <Sa...@symantec.com> wrote:
Currently a user can kill a topology immediately, without waiting for in-flight messages to be processed. For example, if Storm is writing to a file when the user kills the topology, the file is not closed or moved to its proper location. We need to educate the operations team to do the right thing, but there is still a chance that the procedure will not be followed, leaving the system in an inconsistent state.

Can we set a mandatory (configurable) timeout when a user kills a Storm topology? The user should not be allowed to kill a topology with a wait time shorter than the configured one.

Some cases:
1) If the topology is long running, don't allow the user to kill it with a wait time shorter than the configured one.
2) If the topology was just deployed, allow the user to kill it instantly (as the deployment might have been a mistake).
3) Handle the same cases from the command line.

Thanks,
Sachin