Posted to user@flume.apache.org by Ma...@gdc4s.com on 2013/02/12 20:24:59 UTC

Preventing Data Loss during Restart

I've seen some threads on this online in the past but I can't seem to
find a distinct answer. We're deploying Flume in a production
environment where we're going to be grabbing log data from syslog and
other sources. While Flume supports runtime configuration changes, we
are still noticing data loss during testing even with a file channel.
Now, this is a single channel, source, and sink setup, no redundancy.
Does anyone know of a clean way to support guaranteed delivery without
redundancy?

 

Thanks!

 

Matt

This message and/or attachments may include information subject to GDC4S
S.P. 1.8.6 and GD Corporate Policy 07-105 and are intended to be
accessed only by authorized recipients.  Use, storage and transmission
are governed by General Dynamics and its policies. Contractual
restrictions apply to third parties.  Recipients should refer to the
policies or contract to determine proper handling.  Unauthorized review,
use, disclosure or distribution is prohibited.  If you are not an
intended recipient, please contact the sender and destroy all copies of
the original message.

 


Re: Preventing Data Loss during Restart

Posted by Friso van Vollenhoven <fv...@xebia.com>.
The link I sent in my previous message was not the correct one. Check http://www.rsyslog.com/doc/relp.html for the RELP reliable syslog protocol that rsyslog has support for. I am currently trying to figure out how stable it is and who uses it in production.

Friso



Re: Preventing Data Loss during Restart

Posted by Friso van Vollenhoven <fv...@xebia.com>.
If you are using rsyslog as your syslog daemon (the default in CentOS and RHEL), then there is such a thing as reliable TCP transport built in (http://www.rsyslog.com/doc/rsyslog_reliable_forwarding.html). Flume doesn't have a source for this, but it would be a nice feature to build. I have thought about this before. We also use syslog for sending logs to Flume and are also not keen on running Java front-end boxes, so we have the same problem.

I will try to have a look at their protocol today, to see how complex it would be (not making any promises).


Cheers,
Friso
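
The idea behind reliable forwarding (buffer messages locally while the receiver is unavailable, and drop them only after a confirmed send) can be sketched roughly as follows. This is an illustration of the concept only, not rsyslog's implementation or its protocol; the DiskSpool class, the spool file layout, and the send callback are all hypothetical names made up for this sketch.

```python
import json
import os
from typing import Callable, List


class DiskSpool:
    """Hypothetical file-backed buffer: entries leave only after a confirmed send."""

    def __init__(self, path: str) -> None:
        self.path = path

    def append(self, message: str) -> None:
        # Flush and fsync before returning, so an accepted message
        # survives a restart of the forwarding process.
        with open(self.path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(message) + "\n")
            fh.flush()
            os.fsync(fh.fileno())

    def pending(self) -> List[str]:
        # Messages that have been accepted but not yet confirmed as sent.
        if not os.path.exists(self.path):
            return []
        with open(self.path, "r", encoding="utf-8") as fh:
            return [json.loads(line) for line in fh if line.strip()]

    def drain(self, send: Callable[[str], bool]) -> int:
        """Send pending messages in order and stop at the first failure.

        `send` is an assumed callback that returns True only when the
        receiver has acknowledged the message. Returns the number sent.
        """
        messages = self.pending()
        sent = 0
        for message in messages:
            if not send(message):
                break
            sent += 1
        # Rewrite the spool with the unsent remainder, then swap it in
        # atomically so a crash never leaves a half-written spool file.
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as fh:
            for message in messages[sent:]:
                fh.write(json.dumps(message) + "\n")
            fh.flush()
            os.fsync(fh.fileno())
        os.replace(tmp_path, self.path)
        return sent
```

Note that this gives at-least-once behaviour: if the process dies after a successful send but before the spool is rewritten, the message is sent again on the next drain, so the receiving side has to tolerate duplicates.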





RE: Preventing Data Loss during Restart

Posted by Ma...@gdc4s.com.
Yeah, I’m starting to answer my own question. We’re using 1.3, so we do have Avro. We were trying to avoid installing anything on our client (source) machines so that we would not have to install Java on machines that don’t otherwise need it.

 



Re: Preventing Data Loss during Restart

Posted by Hari Shreedharan <hs...@cloudera.com>.
Hi,  

What version of Flume are you using? Also note that syslog is a fire-and-forget protocol, so when you reconfigure, any events not yet persisted to the file channel are lost. Since there is no way of informing the data source that the data was not written to disk, the sender cannot know that it needs to resend. We recommend using a source that actually reports failure, such as Avro, Thrift (available on trunk, not in any release yet), or HTTP. This will allow you to retry if Flume reports failure.


Hari  

--  
Hari Shreedharan
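
As a rough illustration of the retry approach described above, here is a minimal Python sketch of a client that posts a batch of log lines over HTTP and retries when the receiver does not confirm it. It assumes a Flume HTTP source using its JSON handler (a JSON array of events, each with "headers" and "body") and treats only a 200 response as success. The function names, the backoff values, and the URL mentioned in the comments are assumptions for the example, not anything taken from this thread.

```python
import json
import time
import urllib.error
import urllib.request
from typing import Callable, List


def build_payload(lines: List[str]) -> bytes:
    # One event per log line, in the shape assumed for the JSON handler.
    events = [{"headers": {}, "body": line} for line in lines]
    return json.dumps(events).encode("utf-8")


def post_once(url: str, payload: bytes, timeout: float = 5.0) -> bool:
    """Return True only if the receiver answered 200 for this batch."""
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status:
        # treat all of these as "not delivered" so the caller retries.
        return False


def send_with_retry(
    lines: List[str],
    post: Callable[[bytes], bool],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Resend the same batch until `post` confirms it or attempts run out."""
    payload = build_payload(lines)
    for attempt in range(max_attempts):
        if post(payload):
            return True
        if attempt < max_attempts - 1:
            # Exponential backoff between attempts: 0.5s, 1s, 2s, ...
            sleep(base_delay * (2 ** attempt))
    return False


# Example wiring (hypothetical host and port, not from this thread):
#   url = "http://flume-host.example:5140"
#   delivered = send_with_retry(["line 1", "line 2"], lambda p: post_once(url, p))
```

If send_with_retry returns False, the caller still holds the batch and can keep it (for example on local disk) until the agent is reachable again, which is the property a plain syslog sender lacks. Retrying a batch whose acknowledgement was lost can produce duplicates downstream.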

