Posted to user@flume.apache.org by David Sinclair <ds...@chariotsolutions.com> on 2013/10/01 17:04:03 UTC

HDFS Sink Question

Hi all,

I have created an AMQP Source that is being used to feed an HDFS Sink.
Everything is working as expected, but I wanted to try out some error
scenarios.

After creating a file in HDFS and starting to write to it, I shut down
HDFS. I saw the errors in the log as I would expect, and after the
configured roll time the sink tried to close the file. Since HDFS wasn't
running, it wasn't able to do so. I restarted HDFS in the hope that it
would retry the close, but it did not.

Can someone tell me expected behavior under the following scenarios?


   - HDFS isn't available before ever trying to create/write to a file
   - HDFS becomes unavailable after already creating a file and starting to
   write to it
   - HDFS is unavailable when trying to close a file

I'd also be happy to contribute the AMQP source. I wrote the old version
for the original Flume:

https://github.com/stampy88/flume-amqp-plugin/

Let me know if you'd be interested and thanks for the answers.

dave

Re: HDFS Sink Question

Posted by Hari Shreedharan <hs...@cloudera.com>.
Can you file the issues you found in a JIRA here:
https://issues.apache.org/jira/browse/FLUME. If this is a real issue, we
should fix it. Ideally the sink should reconnect to a broken HDFS, though
probably only after the initial connection has been established. I am not
sure what happens if the HDFS connection fails.
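
Roughly, the retry the sink would need on close might look like this
sketch (illustrative only, not the actual Flume sink code):

    import java.io.IOException;
    import org.apache.hadoop.fs.FSDataOutputStream;

    // Sketch: retry an HDFS close with backoff instead of giving up
    // after one attempt. Illustrative only.
    static void closeWithRetry(FSDataOutputStream out, int maxRetries)
            throws IOException {
        for (int attempt = 1; ; attempt++) {
            try {
                out.close();                       // flush and close the file
                return;                            // success
            } catch (IOException e) {
                if (attempt >= maxRetries) {
                    throw e;                       // out of retries, surface the error
                }
                try {
                    Thread.sleep(1000L * attempt); // linear backoff between attempts
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }
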
Thanks,
Hari



Re: HDFS Sink Question

Posted by DSuiter RDX <ds...@rdx.com>.
I can see that being an issue - hopefully your HDFS never hiccups, but if
it does, or if you need to stop it, it seems like restarting the agent is
the only way to recover...

As a workaround, you may be able to set up a file channel, and then maybe
some kind of trigger script to restart the agents if the HDFS service
bounces? Just throwing spaghetti there...
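
A durable file channel would at least keep events on local disk while the
sink is down. A minimal sketch of that part of the config (directory paths
are placeholders):

    # buffer events on local disk so they survive a sink outage or restart
    a1.channels.c1.type = file
    a1.channels.c1.checkpointDir = /var/flume/checkpoint
    a1.channels.c1.dataDirs = /var/flume/data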

Have you explored Kafka as an alternative? I haven't gone deeply into it,
but I know some people have found it to be better for their design than
Flume.

Well, hopefully you get the answers you need. If you rewrite the HDFS sink
with this built-in, I'm sure the project will be interested!

Devin Suiter
Jr. Data Solutions Software Engineer
100 Sandusky Street | 2nd Floor | Pittsburgh, PA 15212
Google Voice: 412-256-8556 | www.rdx.com



Re: HDFS Sink Question

Posted by David Sinclair <ds...@chariotsolutions.com>.
Thanks Devin. I have looked at the source, and I can say for certain that
the connection is never re-established: there is no code that detects that
type of error.

What I was looking for from the devs was confirmation of my findings and
any workarounds besides writing my own HDFS sink.

Not having graceful recovery is a pain and may prevent us from using
Flume.
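
The missing piece is roughly this shape: catch the write failure, throw
away the dead writer, and rebuild it on the next attempt. A sketch under
assumptions, not the real sink code:

    import java.io.IOException;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Sketch: detect a failed write and rebuild the writer instead of
    // reusing a stream whose connection is dead. Illustrative only.
    static FSDataOutputStream writeWithRecovery(FileSystem fs, Path path,
            FSDataOutputStream out, byte[] event) throws IOException {
        try {
            out.write(event);
            out.hflush();           // hflush on Hadoop 2.x; 1.x used sync()
            return out;             // stream is still healthy
        } catch (IOException e) {
            try { out.close(); } catch (IOException ignored) { }  // best effort
            // reopen; this overwrites the file, so a real fix would roll
            // to a new file (or use fs.append where supported)
            FSDataOutputStream fresh = fs.create(path, true);
            fresh.write(event);     // retry the failed event on the new stream
            fresh.hflush();
            return fresh;
        }
    }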



Re: HDFS Sink Question

Posted by DSuiter RDX <ds...@rdx.com>.
David,

In experimenting with the file_roll sink for local logging, I noticed that
the file it wrote to was created when the agent starts. If you start the
agent, then remove the file, and attempt to write, no new file is created.
Perhaps the HDFS sink is similar: when the sink starts, the destination is
established, and if that file chain is broken, Flume cannot gracefully
detect and correct it. It may have something to do with how the sink looks
for the target? I'm not a developer for Flume, but that is my observed
behavior on file_roll. I am working through kinks in the HDFS sink with
remote TCP logging from rsyslog right now... maybe I will have some more
insight for you in a few days...
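
If anyone wants to reproduce the file_roll observation, a minimal sink
config might look like this (the directory is a placeholder):

    # file_roll sink writing to a local directory; the output file is
    # created when the agent starts, per the observation above
    a1.sinks.k1.type = file_roll
    a1.sinks.k1.channel = c1
    a1.sinks.k1.sink.directory = /var/log/flume
    a1.sinks.k1.sink.rollInterval = 30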

Devin Suiter
Jr. Data Solutions Software Engineer
100 Sandusky Street | 2nd Floor | Pittsburgh, PA 15212
Google Voice: 412-256-8556 | www.rdx.com



Re: HDFS Sink Question

Posted by David Sinclair <ds...@chariotsolutions.com>.
Anyone?

This is what I am seeing for the scenarios I asked about, but I wanted
confirmation from the devs on expected behavior.

   - HDFS isn't available before ever trying to create/write to a file:
     the sink continually tries to create the file and finally succeeds
     when the cluster is available.
   - HDFS becomes unavailable after already creating a file and starting
     to write to it: the writer loses the connection, and even after the
     cluster is available again it never re-establishes the connection.
     Data loss occurs since it never recovers.
   - HDFS is unavailable when trying to close a file: this suffers from
     the same problems as above.



