Posted to user@storm.apache.org by Eranga Heshan <er...@gmail.com> on 2016/12/14 13:36:02 UTC

Strange Behavior of Storm Worker

Hi all,

I recently ran a topology consisting of 5 workers on a 4-node cluster. Every
worker has the following configuration parameter set:

worker.childopts: "-Xms1500m"
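
As a side note, -Xms only sets the initial heap size; the maximum heap stays at
the JVM default unless -Xmx is given as well. A sketch that pins both (the
1500m values are just placeholders matching the line above) would be:

worker.childopts: "-Xms1500m -Xmx1500m"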

When the topology was submitted, I checked each worker's behavior and found
that one worker (which runs alone on one node) keeps restarting.

This doesn't actually affect processing, because the restarted worker does the
same job as the previous one. But I am curious to know what exactly is
happening to the worker to make it restart.

I checked the free memory of that particular worker's node continuously and
found that the worker gets restarted even though the node has enough memory
left (more than 1 GB). However, there might be many events buffered for that
worker to process, since the spout is producing events at a much higher rate.

Given the above details, can anyone please clarify what might be happening to
the worker?

Thanks,
Regards,





Eranga Heshan
*Undergraduate*
Computer Science & Engineering
University of Moratuwa
Mobile: +94 71 138 2686
Email: eranga@wso2.com
<https://www.facebook.com/erangaheshan>   <https://twitter.com/erangaheshan>
   <https://www.linkedin.com/in/erangaheshan>

Re: Strange Behavior of Storm Worker

Posted by Erik Weathers <ew...@groupon.com>.
Workers do not restart on their own.  There *must* be supervisor log entries
about starting workers and about noticing that workers are dead.  Read through
the supervisor logs; don't just search for errors.
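
For anyone following along, here is a rough sketch of how those logs might be
scanned on the node in question. The path and the search terms are assumptions
for a default Storm 1.0.x layout; the exact wording of the supervisor's
messages varies by version, which is why the grep is deliberately broad:

# On the node whose worker keeps restarting; adjust STORM_DIR to your install.
STORM_DIR=/opt/storm
# Look for lines where the supervisor launches, shuts down, or times out
# workers, rather than grepping only for ERROR.
grep -inE "launch|shut|kill|timed out|state" "$STORM_DIR/logs/supervisor.log" | tail -n 100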


Re: Strange Behavior of Storm Worker

Posted by Mostafa Gomaa <mg...@trendak.com>.
That STDERR line most probably indicates a dependency issue: you are probably
using multiple external libraries that pull in conflicting logging libraries as
dependencies. As for the changing PID, I am not too sure about that, but I am
interested to see whether someone else can shed some light on it.
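
If the build is Maven-based (an assumption here; the same idea applies to
Gradle), one way to see which libraries drag in the extra SLF4J bindings is
something along these lines:

# List every artifact in the dependency tree that belongs to org.slf4j, so
# conflicting bindings (e.g. slf4j-log4j12 next to log4j-slf4j-impl) stand out.
mvn dependency:tree -Dincludes=org.slf4j

The duplicate binding can then usually be excluded from the offending
dependency in the POM.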


Re: Strange Behavior of Storm Worker

Posted by Eranga Heshan <er...@gmail.com>.
I am using storm-1.0.2.

I checked all the logs in STORM_DIR/logs. I found no [ERROR] entries printed,
only [INFO] entries. I saw STDERR output logged as [INFO]. For example,

STDERR [INFO] SLF4J: Class path contains multiple SLF4J bindings.

There was no clear log entry identifying that the worker was killed. As I
mentioned before, I am not sure what actually happened to the worker. All I
observed was that the worker changed its PID while the topology was still
running. But ultimately, my topology ran fine and produced the desired
output.
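
For what it's worth, one crude way to confirm such a restart from the node
itself is to watch which PID owns the worker's port. This is only a sketch: it
assumes a Linux node with lsof installed, and 6704 is just an example worker
port (the one mentioned elsewhere in this thread):

# Print the PID listening on the worker port every 5 seconds; if the number
# changes, the worker process has been replaced.
watch -n 5 "lsof -t -iTCP:6704 -sTCP:LISTEN"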

Are you sure that there is an issue to be fixed?

Thanks,
Regards,



Eranga Heshan
*Undergraduate*
Computer Science & Engineering
University of Moratuwa
Mobile: +94 71 138 2686
Email: eranga@wso2.com
<https://www.facebook.com/erangaheshan>   <https://twitter.com/erangaheshan>
   <https://www.linkedin.com/in/erangaheshan>


Re: Strange Behavior of Storm Worker

Posted by Erik Weathers <ew...@groupon.com>.
What version of Storm are you running?  The newer ones (I believe 0.10+)
record stdout/stderr from workers via a wrapping "LogWriter" process.  If
your worker process is dying as you say, there *will* be logs in one or
more of these places:

   - supervisor logs
   - worker logs
   - worker stderr/stdout logs
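
To make that concrete, on a 1.0.x install those files typically live somewhere
like the paths below. This is a sketch, not gospel: the base directory depends
on how storm.log.dir is configured, so /opt/storm is only an assumption:

STORM_DIR=/opt/storm
# Supervisor log on each node:
ls "$STORM_DIR"/logs/supervisor.log
# Per-worker logs (and the stdout/stderr files captured by the LogWriter),
# grouped by topology id and worker port:
ls "$STORM_DIR"/logs/workers-artifacts/*/*/worker.log*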

You should figure out why it's dying and fix whatever that issue is.

- Erik


Re: Strange Behavior of Storm Worker

Posted by Eranga Heshan <er...@gmail.com>.
I checked the log files and there are no errors logged.

While the topology was running, I checked that log directory. Although the
worker gets restarted, it keeps writing to the same log file as long as the
new worker runs on the same port (port 6704). In my case, after a while, it
selects another port (port 6700) and then writes a new log. (The log directory
is named after the port number.)

I would like to know whether this is normal behavior for a Storm worker,
because this scenario does not affect the topology's processing.

Thanks,
Regards,


Eranga Heshan
*Undergraduate*
Computer Science & Engineering
University of Moratuwa
Mobile: +94 71 138 2686
Email: eranga@wso2.com
<https://www.facebook.com/erangaheshan>   <https://twitter.com/erangaheshan>
   <https://www.linkedin.com/in/erangaheshan>


Re: Strange Behavior of Storm Worker

Posted by Mostafa Gomaa <mg...@trendak.com>.
I would check the log file for that worker. You can find it under
storm/logs/workers-artifacts/topology_id. Check whether there are any errors
that are causing the worker to restart.
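
A quick way to follow that file while the topology is running (a sketch: the
per-port layout below is the Storm 1.x workers-artifacts convention, and 6704
is only an example worker port):

# Follow the worker log for the worker bound to port 6704, whichever topology
# directory it lives under.
tail -f storm/logs/workers-artifacts/*/6704/worker.log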
