Posted to dev@slider.apache.org by Rui Zhang <rz...@vertica.com> on 2014/07/24 21:26:31 UTC

Start a role always IN_PROGRESS status

Hi,

I can now start the package I created, but according to the log it always
stays in IN_PROGRESS status, and after a long time it shows Failed.
However, my application runs perfectly without any errors.

Why is this? How is it determined whether the start has completed?

Thanks.

-- 
Rui Zhang
Software engineer Intern
Vertica, an HP Company
rzhang@vertica.com


Re: Start a role always IN_PROGRESS status

Posted by Rui Zhang <rz...@vertica.com>.
Thanks! It works. I should have looked at your sample more carefully.

My start implementation is a little complex. I first bootstrap the
catalog. Vertica nodes use spread to communicate with each other, so I
also need to update the spread configuration file to include all the
hostnames Vertica will run on. After one node (which I designate as the
master) is up, I do some node creation work. Then I can start all the
Vertica nodes. All of this is easy with Vertica's admintool.

But it becomes difficult in the YARN environment, because I need to
coordinate every node to avoid duplicate node names, and duplicate catalog
paths if more than one Vertica node runs on the same machine. I also
need to remember each node's hostname to update the configuration file.
For now I use ZooKeeper to do all the coordination and to store the
hostnames. I hope Slider will support more functionality like this in
future releases.
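
The bookkeeping described above can be sketched roughly as follows. This is only an illustration under stated assumptions: an in-memory class stands in for the shared state kept in ZooKeeper, and the class name, the "node%04d" naming scheme, and the catalog directory layout are all hypothetical, not Vertica's or Slider's. A real deployment would also need the atomic operations that a coordination service provides, which this single-process sketch does not attempt.

```python
import posixpath
from collections import defaultdict


class NodeRegistry:
    """In-memory stand-in (hypothetical) for state shared through a coordination service."""

    def __init__(self, catalog_root):
        self.catalog_root = catalog_root
        self.nodes = []                    # registrations, in arrival order
        self.per_host = defaultdict(int)   # number of nodes already placed on each host

    def register(self, hostname):
        """Assign a cluster-wide unique node name and a per-host unique catalog path."""
        index = len(self.nodes) + 1
        slot = self.per_host[hostname]
        self.per_host[hostname] += 1
        node = {
            "name": "node%04d" % index,    # hypothetical naming scheme
            "host": hostname,
            "catalog": posixpath.join(self.catalog_root, "catalog_%d" % slot),
            "is_master": index == 1,       # first node to register acts as master
        }
        self.nodes.append(node)
        return node

    def hostnames(self):
        """Distinct hostnames in registration order, e.g. for a configuration file."""
        seen = []
        for node in self.nodes:
            if node["host"] not in seen:
                seen.append(node["host"])
        return seen


registry = NodeRegistry("/data/vertica")
for host in ["host-a", "host-b", "host-a"]:
    print(registry.register(host))
print(registry.hostnames())
```

Two nodes landing on host-a get different catalog directories, and the hostname list contains each machine only once.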

Thanks.

Rui

On 07/24/2014 05:23 PM, Sumit Mohanty wrote:
> If you are using the helpers provided in the resource management library
> then you can set the wait_for_finish to False.
>
> Execute(process_cmd,
>          user=params.app_user,
>          logoutput=False,
>          wait_for_finish=False
>      )
>
> See the memcached sample I shared.
>
> How does your start() implementation look like?
>
> -Sumit



Re: Start a role always IN_PROGRESS status

Posted by Sumit Mohanty <sm...@hortonworks.com>.
If you are using the helpers provided in the resource management library,
you can set wait_for_finish to False.

Execute(process_cmd,
        user=params.app_user,
        logoutput=False,
        wait_for_finish=False
    )

See the memcached sample I shared.

What does your start() implementation look like?
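
As a rough illustration of what "not waiting" means, here is a sketch that uses only Python's standard library. It is not the resource management library's implementation, and start_detached and is_running are hypothetical helper names. The launcher sends the process's output to a log file, returns immediately, and a separate check tests whether the PID is still alive (POSIX only).

```python
import os
import subprocess
import sys
import tempfile


def start_detached(cmd, log_path):
    """Start cmd without waiting for it to finish; return the Popen handle."""
    log = open(log_path, "ab")
    try:
        # Output goes to a file, so no pipe ties the child to the caller.
        return subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # run the child in its own session
        )
    finally:
        log.close()  # the child keeps its own copy of the descriptor


def is_running(pid):
    """Return True if a process with this PID exists."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


# Hypothetical long-running service: a Python process that sleeps for 30 seconds.
service_cmd = [sys.executable, "-c", "import time; time.sleep(30)"]
log_file = os.path.join(tempfile.mkdtemp(), "service.log")

proc = start_detached(service_cmd, log_file)  # returns right away
print("started pid", proc.pid, "running:", is_running(proc.pid))

proc.terminate()  # clean up the demo process
proc.wait()
```

The point of the design is that the start step only launches the process; whether it is still alive is answered later by a separate check rather than by blocking on the launch.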

-Sumit


On Thu, Jul 24, 2014 at 1:56 PM, Rui Zhang <rz...@vertica.com> wrote:

> Actually Vertica is not killed by the script and run perfectly. But Slider
> will think that Vertica is killed and try again.


Re: Start a role always IN_PROGRESS status

Posted by Rui Zhang <rz...@vertica.com>.
Actually, Vertica is not killed by the script and runs perfectly, but
Slider thinks that Vertica was killed and tries again.

On 07/24/2014 04:52 PM, Rui Zhang wrote:
> Hi,
>
> thanks for your reply.
>
> I know why it hangs. I have read the code of agent/PythonExecutor.py
>
> It blocks in the line "process.communicate()".
>
> My start command will spawn three child processes. Maybe this line is 
> waiting for the three to finish? I am not very familiar with how the 
> communicate function works. But Vertica is a long-running process, it 
> is not possible for Vertica to stop. So it always timeout and get 
> killed by the watch-dog-thread.
>
> Is there a way to avoid the waiting?
>
> Thanks.



Re: Start a role always IN_PROGRESS status

Posted by Rui Zhang <rz...@vertica.com>.
Hi,

Thanks for your reply.

I know why it hangs. I have read the code of agent/PythonExecutor.py.

It blocks on the line "process.communicate()".

My start command spawns three child processes. Maybe this line is
waiting for all three to finish? I am not very familiar with how the
communicate function works. But Vertica is a long-running process, so it
is not expected to stop. So it always times out and gets killed by
the watchdog thread.
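
A minimal sketch of this behavior, using only Python's standard subprocess module rather than the agent's actual code. It assumes a POSIX shell, and the 2-second sleep is a hypothetical stand-in for a long-running server. communicate() reads until end-of-file on the pipe, so a background child that inherits the pipe keeps the call blocked even after the launching shell has exited.

```python
import subprocess
import time


def timed_communicate(shell_cmd):
    """Run shell_cmd with stdout piped; return the seconds spent until communicate() returns."""
    started = time.time()
    proc = subprocess.Popen(shell_cmd, shell=True, stdout=subprocess.PIPE)
    proc.communicate()  # returns at EOF on the pipe, not when the shell exits
    return time.time() - started


# The shell exits at once, but the background sleep inherits the stdout pipe
# and holds it open, so communicate() waits about 2 seconds.
inherited = timed_communicate("sleep 2 &")

# With the background process's output redirected, nothing holds the pipe
# open and communicate() returns almost immediately.
redirected = timed_communicate("sleep 2 > /dev/null 2>&1 &")

print("inherited pipe: %.1fs, redirected: %.1fs" % (inherited, redirected))
```

With a server that never exits, the first case would never reach end-of-file, which matches the hang followed by a timeout.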

Is there a way to avoid the waiting?

Thanks.

On 07/24/2014 04:15 PM, Sumit Mohanty wrote:
> My guess is that start() does not complete or fails and gets retried few
> times and then eventually fails. Can you share the agent logs?
>
> We have a new release now 0.40 - can you port your package over to 0.40? If
> you can share your package I can help you do that. In any case, only
> critical change is the metainfo.xml structure where its enclosed within
> "<application></application>".
>
> http://slider.incubator.apache.org/docs/slider_specs/hello_world_slider_app.html
> is a work-in-progress doc for creating an application package which will
> provide you the details.
>
> -Sumit



Re: Start a role always IN_PROGRESS status

Posted by Sumit Mohanty <sm...@apache.org>.
My guess is that start() either does not complete or fails, gets retried a
few times, and then eventually fails. Can you share the agent logs?

We have a new release now, 0.40. Can you port your package over to 0.40? If
you can share your package, I can help you do that. In any case, the only
critical change is the metainfo.xml structure, where it's enclosed within
"<application></application>".

http://slider.incubator.apache.org/docs/slider_specs/hello_world_slider_app.html
is a work-in-progress doc on creating an application package, which will
give you the details.

-Sumit


On Thu, Jul 24, 2014 at 12:26 PM, Rui Zhang <rz...@vertica.com> wrote:

> Hi,
>
> I can start my own created package now. But it always stay IN_PROGRESS
> status according to the log and after a long time it shows Failed. However,
> my application runs perfectly without any errors.
>
> Why is this? How to determine the start is completed or not?
>
> Thanks.
>
> --
> Rui Zhang
> Software engineer Intern
> Vertica, an HP Company
> rzhang@vertica.com
>
>