Posted to user@flume.apache.org by Mark <st...@gmail.com> on 2011/08/09 16:56:20 UTC

Transforming value

I'm not quite sure if this is possible, but is there a way to transform
the message before saving it into a collector source? I see that regex can
be used to supply dynamic values for the path/filename of the sink, but
I cannot find out how to apply this to the value of the message. For
example, we would like to use regex to parse out a certain value from
the message, similar to rsyslog message templates.

Thanks


Re: Transforming value

Posted by NerdyNick <ne...@gmail.com>.
There is the format() decorator, which can modify the body of the event
using the attributes collection, the attributes being what you just
extracted via regex or regexAll. Do note that you will need to do this on
the agent and not on the collectors, as it could affect acks.
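
To make the idea concrete, here is a rough Python model of "extract into attributes, then rebuild the body from them". It is only a sketch of the concept, not Flume code: the event dictionary, `extract_all`, `format_body`, and the `rest` attribute are hypothetical names, and the real argument syntax of format() is not shown in this thread. The `%{name}` escape style is borrowed from the collectorSink path used elsewhere in the thread.

```python
import re

def extract_all(event, pattern, names):
    """Copy each capture group into a named attribute (hypothetical helper)."""
    match = re.search(pattern, event["body"])
    if match:
        for index, name in enumerate(names, start=1):
            event["attrs"][name] = match.group(index)
    return event

def format_body(event, template):
    """Rebuild the body from attributes using %{name} escapes (hypothetical helper)."""
    event["body"] = re.sub(
        r"%\{(\w+)\}",
        lambda m: event["attrs"].get(m.group(1), ""),  # unknown names become ""
        template,
    )
    return event

# Sample message taken from the thread.
event = {"body": "12345 FOO this is my message", "attrs": {}}
extract_all(event, r"\d+ (\w+) (.*)", ["type", "rest"])
format_body(event, "%{rest}")
print(event["attrs"]["type"], "|", event["body"])
```

In this model the `type` attribute stays available for bucketing while the body is reduced to the remainder of the line.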




-- 
Nick Verbeck - NerdyNick
----------------------------------------------------
NerdyNick.com
Coloco.ubuntu-rocks.org

Re: Transforming value

Posted by Mark <st...@gmail.com>.
OK, I just wanted to check before writing something custom.

Do you foresee adding such an ability? It would sure help those of us who
are switching from other aggregation frameworks such as rsyslog or syslog-ng.

Cheers!


Re: Transforming value

Posted by Eric Sammer <es...@cloudera.com>.
Mark:

The extractor is just that. In other words, it doesn't mutate the
original message; it just pulls bits out into attributes. Today, there's
no way to change the message (more of a substitute than a simple
capturing match) without writing a custom deco. Sorry.
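
The difference between a capturing match and a substitute can be shown with a small Python sketch. This is plain `re` usage rather than Flume code, and `strip_prefix` is a hypothetical name for the kind of logic a custom decorator would have to carry.

```python
import re

# Sample message taken from the thread.
body = "12345 FOO this is my message"

# Capture: reads a value out of the body; the body itself is left alone.
captured = re.search(r"\d+ (\w+)", body).group(1)

def strip_prefix(text):
    """Substitute: return a new body with the leading '<digits> <word> ' removed."""
    return re.sub(r"^\d+ \w+ ", "", text, count=1)

new_body = strip_prefix(body)
print(captured, "|", new_body)
```

The extractor corresponds to the first step only; the second step is the part that needs custom code.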




-- 
Eric Sammer
twitter: esammer
data: www.cloudera.com

Re: Transforming value

Posted by Mark <st...@gmail.com>.
Thanks. I figured I could use one of those decorators, but I'm not sure of
the syntax for applying one when also using a regex to parse out a value
to use for bucketing.

This is what I'm using:
  - exec multiconfig 'localhost: syslogTcp(10514) | { regex("\d+ (\w+)", 1, "type") => collectorSink("file:///tmp/testing/%{type}", "" ) } ;'

And I have messages in the following format:
  - "12345 FOO this is my message"

How can I bucket on "FOO" and save only the message "this is my
message"? Right now, bucketing works, but the whole string gets saved as
the message.

Thanks



Re: Transforming value

Posted by Eric Sammer <es...@cloudera.com>.
Mark:

The regex[1] (or regexAll) decorators should do what you want. They
apply a given regex to the body of an event and put a capture in an
attribute field. They're primarily for things such as extracting content
that one can then bucket output by.
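
As a rough illustration of that behaviour, the following Python sketch models an event as a body plus an attribute map. It is not Flume code: `regex_extract` and the event dictionary are hypothetical, and whether Flume searches or anchors the pattern is an assumption here (the sketch uses `re.search`). The pattern, attribute name, and path come from the configuration quoted in this thread.

```python
import re

def regex_extract(event, pattern, group, attr_name):
    """Copy one capture group from the body into an attribute; the body is untouched."""
    match = re.search(pattern, event["body"])
    if match:
        event["attrs"][attr_name] = match.group(group)
    return event

# Sample message and pattern taken from the thread.
event = {"body": "12345 FOO this is my message", "attrs": {}}
regex_extract(event, r"\d+ (\w+)", 1, "type")

# The attribute can then drive bucketing, e.g. by filling a %{type} escape in a path.
path = "file:///tmp/testing/%{type}".replace("%{type}", event["attrs"]["type"])
print(path)
print(event["body"])
```

Note that the body still holds the full original line after extraction.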

If you need to do something fancier than this, you'll have to
implement a decorator yourself (or file a JIRA to request an
additional feature of an existing deco).

[1] http://archive.cloudera.com/cdh/3/flume/UserGuide/index.html#_flume_sink_decorator_catalog




-- 
Eric Sammer
twitter: esammer
data: www.cloudera.com

Re: Transforming value

Posted by Mark <st...@gmail.com>.
Thread working?
