Posted to user@flume.apache.org by Venkatesh B <ve...@gmail.com> on 2014/07/15 17:44:39 UTC

Flume custom source get BLOCKED automatically after some time

Hello,

I have written a custom Flume source, and in its start() method I start a
thread. This thread, "data_thread", collects data from the web and writes it
to the channel processor.

The data collection thread fetches data from the web, sleeps for some time
once the available data is exhausted, and then fetches again; this cycle
repeats continuously. Control stays inside "data_thread" the whole time,
because the thread logic runs in a while(true) loop.
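
The loop described above can be sketched roughly as follows. This is only an illustration under assumptions, not the poster's actual source: the Flume types are deliberately left out, `PollingWorker`, `fetcher`, and `sink` are hypothetical names (the `sink` stands in for the channel processor), and Java 8 or later is assumed. It replaces the bare while(true) with an interrupt check so that a stop() method can end the thread cleanly.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical sketch of a fetch / sleep / repeat worker thread.
final class PollingWorker {
    private final Supplier<List<String>> fetcher; // stand-in for the web fetch
    private final Consumer<String> sink;          // stand-in for the channel processor
    private final long idleMillis;                // pause between fetch rounds
    private Thread thread;                        // guarded by "this"

    PollingWorker(Supplier<List<String>> fetcher, Consumer<String> sink, long idleMillis) {
        this.fetcher = fetcher;
        this.sink = sink;
        this.idleMillis = idleMillis;
    }

    // Starts the worker thread once; later calls are no-ops while it is running.
    synchronized void start() {
        if (thread != null) {
            return;
        }
        thread = new Thread(this::runLoop, "data_thread");
        thread.start();
    }

    // Interrupts the worker and waits for it to finish.
    void stop() throws InterruptedException {
        Thread t;
        synchronized (this) {
            t = thread;
            thread = null;
        }
        if (t != null) {
            t.interrupt();
            t.join();
        }
    }

    private void runLoop() {
        while (!Thread.currentThread().isInterrupted()) {
            for (String item : fetcher.get()) {
                sink.accept(item);
            }
            try {
                Thread.sleep(idleMillis);
            } catch (InterruptedException e) {
                // Restore the flag so the loop condition sees the stop request.
                Thread.currentThread().interrupt();
            }
        }
    }
}
```

In a real source, start() and stop() would be called from the source's own lifecycle methods.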

The "data_thread" thread collects data for up to 7 minutes and then stops
collecting.
I added logging and found that the state of "data_thread" is "BLOCKED", and
it never returns.
In my custom source I tried both synchronizing all methods and leaving them
all unsynchronized, but neither helped.

This blocking occurs when I start the Flume agent from a Java program. When
I start the agent from the shell it works fine, and data is collected
continuously for days without any error.

Any idea why the thread is going into the BLOCKED state?

I would also like to know more about the environment a Flume custom source
runs in.

Thanking you
Venkatesh

Re: Flume custom source get BLOCKED automatically after some time

Posted by Venkatesh B <ve...@gmail.com>.
Hi Natty,

Thank you so much for telling me about the jstack command.
The problem is now solved.
In my custom source, thread execution was stopping at statements that print
to the console, such as System.out.println() or e.printStackTrace().
Once I knew this, I removed all the console print statements, and it then
worked.
I don't know what is really happening behind the print statements.
Here is the dump of the thread at the point where it stops executing.

=============================================================================
Thread 11911: (state = IN_NATIVE)
 - java.io.FileOutputStream.writeBytes(byte[], int, int) @bci=0 (Interpreted frame)
 - java.io.FileOutputStream.write(byte[], int, int) @bci=4, line=282 (Interpreted frame)
 - java.io.BufferedOutputStream.flushBuffer() @bci=20, line=65 (Interpreted frame)
 - java.io.BufferedOutputStream.flush() @bci=1, line=123 (Interpreted frame)
 - java.io.PrintStream.write(byte[], int, int) @bci=30, line=432 (Compiled frame)
 - sun.nio.cs.StreamEncoder.writeBytes() @bci=120, line=202 (Compiled frame)
 - sun.nio.cs.StreamEncoder.implFlushBuffer() @bci=11, line=272 (Compiled frame)
 - sun.nio.cs.StreamEncoder.flushBuffer() @bci=15, line=85 (Interpreted frame)
 - java.io.OutputStreamWriter.flushBuffer() @bci=4, line=168 (Interpreted frame)
 - java.io.PrintStream.write(java.lang.String) @bci=27, line=477 (Interpreted frame)
 - java.io.PrintStream.print(java.lang.String) @bci=9, line=619 (Interpreted frame)
 - java.io.PrintStream.println(java.lang.String) @bci=6, line=756 (Interpreted frame)
 - org.kavayah.bigdata.flume.source.YoutubeHydrateFlumeSource2.writeComments(java.lang.String) @bci=1311, line=750 (Interpreted frame)
 - org.kavayah.bigdata.flume.source.YoutubeHydrateFlumeSource2$3.run() @bci=871, line=227 (Interpreted frame)
 - java.lang.Thread.run() @bci=11, line=662 (Interpreted frame)

===================================================================


For some time the print statements cause no problem, but after a while the
thread gets stuck at them.
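
One plausible explanation, which is not confirmed anywhere in this thread: when the agent is spawned as a child process of a Java program, its standard output is a pipe to the parent. If the parent never reads that pipe, the operating system's pipe buffer eventually fills, and the next println() waits inside the native write call, which would match the IN_NATIVE frame in FileOutputStream.writeBytes in the dump above. A minimal sketch of draining the child's output in the parent, assuming the agent is launched through ProcessBuilder and Java 8 or later is available; `ChildOutputDrainer`, `drain`, and `launch` are hypothetical names:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: keeps a child process's output pipe from filling up.
final class ChildOutputDrainer {

    // Reads lines from the stream on a daemon thread until end of stream.
    static Thread drain(InputStream in, Consumer<String> lineHandler) {
        Thread t = new Thread(() -> {
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(in, StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    lineHandler.accept(line);
                }
            } catch (IOException e) {
                // The stream closed abnormally (for example, the child was
                // killed); there is nothing left to drain.
            }
        }, "child-output-drainer");
        t.setDaemon(true);
        t.start();
        return t;
    }

    // Starts the command with stderr merged into stdout and drains the result.
    static Process launch(List<String> command, Consumer<String> lineHandler)
            throws IOException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true);
        Process process = builder.start();
        drain(process.getInputStream(), lineHandler);
        return process;
    }
}
```

The command list passed to launch would be whatever the parent program already uses to start the agent. If the output is not needed in the parent at all, ProcessBuilder.inheritIO() or ProcessBuilder.redirectOutput(File) (both Java 7 and later) avoid the reader thread entirely.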

Once again thanks a lot Natty.


Thanks,
Venkatesh.



Re: Flume custom source get BLOCKED automatically after some time

Posted by Jonathan Natkins <na...@streamsets.com>.
Hi Venkatesh,

Does it reliably stop processing events after about 7 minutes, or does it
happen randomly, and just quickly? Does the program immediately start up
the Flume agent?

Have you looked at a thread dump from the program, at all? You can use
`jstack -F <pid>` to produce a stacktrace of all the threads in the
program, which might have some useful information. Can you post a thread
dump of the agent process that is spawned from your program?
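
Where attaching jstack is inconvenient, a rough equivalent can be logged from inside a JVM with Thread.getAllStackTraces(). This is a sketch only: `ThreadDumper` is a hypothetical name, and the dump covers only the JVM that calls it, so it would need to run inside the agent process rather than in the program that spawns it.

```java
import java.util.Map;

// Hypothetical helper: formats the name, state, and stack of every live thread.
final class ThreadDumper {

    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            sb.append('"').append(thread.getName()).append('"')
              .append(" state=").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }
}
```

Note that Thread.getState() reports Java-level states such as RUNNABLE or BLOCKED; a thread waiting inside a native write typically still shows as RUNNABLE, which is one reason the jstack output is the more informative of the two.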

Thanks,
Natty

