Posted to dev@flume.apache.org by Nikolaos Tsipas <Ni...@bbc.co.uk> on 2014/02/13 18:58:13 UTC

Event header validation using interceptors

Hello,

We have a use case that requires validating the headers of events received by an Avro source in order to classify each event as valid or invalid. If an event is invalid, it should be routed to a different channel.

We know how to route events based on the values of specific headers using multiplexing. However, Flume doesn't seem to provide an interceptor suitable for regex validation of headers.

For this reason, we are thinking of creating a new interceptor that would allow regex validation of headers and, depending on the outcome, add a header (e.g. valid = true).
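
To make the proposal concrete, here is a minimal sketch of the validation step in plain Java. It is only an illustration under stated assumptions: the class name HeaderValidator, the rule format (header name mapped to a regex that the whole value must match) and the choice to treat a missing header as invalid are all hypothetical, and nothing here is part of Flume. In a real interceptor this logic would presumably be called from Interceptor.intercept(Event), working on the map returned by event.getHeaders().

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Flume. */
final class HeaderValidator {
    // Header name -> compiled pattern that the whole header value must match.
    private final Map<String, Pattern> rules = new LinkedHashMap<>();

    HeaderValidator(Map<String, String> regexByHeader) {
        for (Map.Entry<String, String> e : regexByHeader.entrySet()) {
            rules.put(e.getKey(), Pattern.compile(e.getValue()));
        }
    }

    /** True only if every configured header is present and fully matches its pattern. */
    boolean isValid(Map<String, String> headers) {
        for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
            String value = headers.get(rule.getKey());
            // Assumption: a missing header counts as invalid.
            if (value == null || !rule.getValue().matcher(value).matches()) {
                return false;
            }
        }
        return true;
    }

    /** Returns a copy of the headers with "valid" set to "true" or "false". */
    Map<String, String> tag(Map<String, String> headers) {
        Map<String, String> tagged = new HashMap<>(headers);
        tagged.put("valid", Boolean.toString(isValid(headers)));
        return tagged;
    }
}
```

A multiplexing channel selector keyed on the "valid" header could then send events tagged "false" to a separate channel, which is the routing step described above.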

Questions:

* Does the above sound like a reasonable solution for what we want to achieve?
* What would be the best way to implement it so that it benefits the Flume community? Should we extend one of the existing interceptors (e.g. RegexFilteringInterceptor) or provide a new one?

Regards,
Nikolaos



----------------------------

http://www.bbc.co.uk
This e-mail (and any attachments) is confidential and may contain personal views which are not the views of the BBC unless specifically stated.
If you have received it in error, please delete it from your system.
Do not use, copy or disclose the information in any way nor act in reliance on it and notify the sender immediately.
Please note that the BBC monitors e-mails sent or received.
Further communication will signify your consent to this.

---------------------

RE: Event header validation using interceptors

Posted by Nikolaos Tsipas <Ni...@bbc.co.uk>.
Thanks for your suggestions. We came across the morphline interceptor before, but it looked a little overcomplicated for what we wanted to do at that point.

However, it might be perfect for what we want to do now, so we will give it a go.

Regards,
Nikolaos
________________________________________
From: Wolfgang Hoschek [whoschek@cloudera.com]
Sent: Thursday, February 13, 2014 8:31 PM
To: dev@flume.apache.org
Subject: Re: Event header validation using interceptors

The morphline interceptor puts all Flume event headers plus the Flume event body into the input morphline record, so morphline commands can match on the entire Flume event.

Wolfgang.



Re: Event header validation using interceptors

Posted by Wolfgang Hoschek <wh...@cloudera.com>.
The morphline interceptor puts all Flume event headers plus the Flume event body into the input morphline record, so morphline commands can match on the entire Flume event.
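
The idea of one record holding both headers and body can be sketched roughly as follows. This is a plain-Java stand-in rather than the Kite Record class: the names EventRecordSketch, toRecord and fieldMatches are hypothetical, and the field name "_attachment_body" reflects my understanding of where the body is stored and should be checked against the morphlines documentation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical stand-in for a morphline record; not the Kite API. */
final class EventRecordSketch {
    // Assumption: field name under which the event body is stored.
    static final String BODY_FIELD = "_attachment_body";

    /** Copies every header into a multi-valued field and adds the body as one more field. */
    static Map<String, List<Object>> toRecord(Map<String, String> headers, byte[] body) {
        Map<String, List<Object>> record = new LinkedHashMap<>();
        for (Map.Entry<String, String> h : headers.entrySet()) {
            record.computeIfAbsent(h.getKey(), k -> new ArrayList<>()).add(h.getValue());
        }
        record.computeIfAbsent(BODY_FIELD, k -> new ArrayList<>()).add(body);
        return record;
    }

    /** True if the field exists and every value is a string that fully matches the regex. */
    static boolean fieldMatches(Map<String, List<Object>> record, String field, String regex) {
        List<Object> values = record.get(field);
        if (values == null || values.isEmpty()) {
            return false;
        }
        Pattern pattern = Pattern.compile(regex);
        for (Object value : values) {
            if (!(value instanceof String) || !pattern.matcher((String) value).matches()) {
                return false;
            }
        }
        return true;
    }
}
```

Because headers and body sit side by side in one structure, a command that matches on a named field can target a header just as easily as the body.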

Wolfgang.

On Feb 13, 2014, at 9:06 PM, Jeff Lord wrote:

> Wolfgang,
> 
> Will the morphline interceptor + grok actually match event headers or
> just the event body?
> 
> -Jeff


Re: Event header validation using interceptors

Posted by Jeff Lord <jl...@cloudera.com>.
Wolfgang,

Will the morphline interceptor + grok actually match event headers or
just the event body?

-Jeff

On Thu, Feb 13, 2014 at 10:05 AM, Wolfgang Hoschek
<wh...@cloudera.com> wrote:
> You could probably do this with a MorphlineInterceptor, e.g. by using the grok command in combination with the tryRules command.
>
> http://flume.apache.org/FlumeUserGuide.html#morphline-interceptor
> http://kitesdk.org/docs/current/kite-morphlines/index.html
> http://kitesdk.org/docs/current/kite-morphlines/morphlinesReferenceGuide.html#grok
> http://kitesdk.org/docs/current/kite-morphlines/morphlinesReferenceGuide.html#tryRules
>
> Wolfgang.

Re: Event header validation using interceptors

Posted by Wolfgang Hoschek <wh...@cloudera.com>.
You could probably do this with a MorphlineInterceptor, e.g. by using the grok command in combination with the tryRules command.

http://flume.apache.org/FlumeUserGuide.html#morphline-interceptor
http://kitesdk.org/docs/current/kite-morphlines/index.html
http://kitesdk.org/docs/current/kite-morphlines/morphlinesReferenceGuide.html#grok
http://kitesdk.org/docs/current/kite-morphlines/morphlinesReferenceGuide.html#tryRules
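
As I understand the tryRules command linked above, rules are tried in order and the first one that runs without failure wins. The control flow might look roughly like this in plain Java; it is a sketch only, with java.util.regex standing in for grok, and TryRulesSketch, Rule, matchThenTag and alwaysTag are hypothetical names, not morphline commands.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;
import java.util.regex.Pattern;

/** Hypothetical illustration of first-success-wins rule processing. */
final class TryRulesSketch {
    /** A rule returns the transformed headers on success, or empty on failure. */
    interface Rule extends Function<Map<String, String>, Optional<Map<String, String>>> {}

    /** Succeeds, and tags valid=true, only if the header exists and fully matches the regex. */
    static Rule matchThenTag(String header, String regex) {
        Pattern pattern = Pattern.compile(regex);
        return headers -> {
            String value = headers.get(header);
            if (value == null || !pattern.matcher(value).matches()) {
                return Optional.empty(); // rule fails; the caller falls through to the next rule
            }
            return Optional.of(with(headers, "valid", "true"));
        };
    }

    /** Fallback rule that always succeeds and tags the event with the given value. */
    static Rule alwaysTag(String value) {
        return headers -> Optional.of(with(headers, "valid", value));
    }

    /** Applies rules in order and returns the result of the first one that succeeds. */
    static Map<String, String> applyFirstMatching(List<Rule> rules, Map<String, String> headers) {
        for (Rule rule : rules) {
            Optional<Map<String, String>> result = rule.apply(headers);
            if (result.isPresent()) {
                return result.get();
            }
        }
        throw new IllegalStateException("no rule succeeded");
    }

    private static Map<String, String> with(Map<String, String> headers, String key, String value) {
        Map<String, String> copy = new HashMap<>(headers);
        copy.put(key, value);
        return copy;
    }
}
```

Putting the regex rule first and an always-succeeding fallback last means every event leaves with a "valid" header, so none is dropped for failing the match.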

Wolfgang.
