Posted to dev@flume.apache.org by "Edward Sargisson (JIRA)" <ji...@apache.org> on 2014/05/23 17:43:03 UTC

[jira] [Commented] (FLUME-2390) Flume-ElasticSearch Data gets posted multiple times when one of the event fail validation at elastic search sink for JSON Data

    [ https://issues.apache.org/jira/browse/FLUME-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007259#comment-14007259 ] 

Edward Sargisson commented on FLUME-2390:
-----------------------------------------

Yes - but what would we do about that?
Our attitude with Flume is to not lose data but complain bitterly and fill up the queue. We don't have a dead letter queue feature. So, yes, if you poison-pill your queue, it's going to sit there and keep retrying.

I'm inclined to close this work item as Working as Designed but feel free to argue for a better approach.
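
For readers following the thread, the retry behaviour described in this comment can be modelled with a small simulation. This is only a sketch under stated assumptions, not Flume's sink code: `deliver`, `process_batch`, `run`, and `RejectedEvent` are hypothetical names, the channel is a plain deque, and the destination is assumed to reject empty bodies one event at a time. It shows why a single bad event makes the events ahead of it in the batch get posted again on every attempt.

```python
from collections import deque


class RejectedEvent(Exception):
    """Raised by the hypothetical destination when an event fails validation."""


def deliver(event, delivered):
    # Stand-in for the write to the destination (assumption): empty bodies
    # are rejected, anything else is recorded as posted.
    if not event:
        raise RejectedEvent("empty body")
    delivered.append(event)


def process_batch(channel, batch_size, delivered):
    """Take up to batch_size events; put them all back if any write fails."""
    batch = [channel.popleft() for _ in range(min(batch_size, len(channel)))]
    try:
        for event in batch:
            deliver(event, delivered)
    except RejectedEvent:
        # Rollback: the whole batch returns to the head of the channel,
        # including events that were already written downstream.
        channel.extendleft(reversed(batch))
        return False
    return True


def run(events, batch_size, attempts):
    """Drive the sink for a fixed number of attempts and report the outcome."""
    channel = deque(events)
    delivered = []
    for _ in range(attempts):
        if not channel:
            break
        process_batch(channel, batch_size, delivered)
    return delivered, list(channel)
```

Calling `run(["a", "", "b"], 3, 4)` in this model posts `"a"` once per attempt while the channel never drains, which matches the duplicate posts reported in the issue. A dead letter queue would amount to moving the rejected event elsewhere instead of rolling the batch back.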

> Flume-ElasticSearch Data gets posted multiple times when one of the event fail validation at elastic search sink for JSON Data
> ------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLUME-2390
>                 URL: https://issues.apache.org/jira/browse/FLUME-2390
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: v1.4.0
>         Environment: CDH4.5
>            Reporter: Deepak Subhramanian
>
> Hi,
> I am using the Elasticsearch sink to post JSON data. I used the temporary fix mentioned in https://issues.apache.org/jira/browse/FLUME-2126 to get JSON data posted to Elasticsearch. When one of the messages fails validation against the Elasticsearch mapping for JSON data (for example, an empty message), Flume seems to post the entire batch again and again until I restart Flume. Because of that, the number of events went from an average of 100 to an average of 2000 per 10 minutes. As a temporary fix, I set a header in my Flume HTTP source for invalid JSON and used an interceptor to send data to multiple Elasticsearch sinks, each with a different index name.
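
The workaround in the description (tag invalid JSON with a header, then route on that header to a different index) can be illustrated roughly as follows. This is a sketch of the idea only: the reporter's actual Flume configuration is not shown in the issue, and the header name `valid_json`, the index names, and both function names are hypothetical.

```python
import json


def tag_event(body):
    """Return (headers, body) with a hypothetical 'valid_json' header."""
    try:
        parsed = json.loads(body)
        # Assumption: only a non-empty JSON object counts as a valid event.
        valid = isinstance(parsed, dict) and bool(parsed)
    except ValueError:
        # json.JSONDecodeError is a ValueError; empty bodies land here.
        valid = False
    return {"valid_json": "true" if valid else "false"}, body


def choose_index(headers, good="events", bad="events_rejected"):
    """Pick the destination index from the header set by tag_event."""
    return good if headers.get("valid_json") == "true" else bad
```

With this split, an empty or malformed message goes to a separate index whose mapping does not reject it, so it no longer blocks the batch that carries the valid events.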



--
This message was sent by Atlassian JIRA
(v6.2#6252)