Posted to users@nifi.apache.org by Jeroen Jacobs <je...@headincloud.be> on 2016/02/01 01:49:10 UTC
Struggling with ExtractText
Hi,
I'm looking at the documentation for the ExtractText processor, and I'm at a total loss as to how I should get started. I have a flowfile with the following content:
/opsbot url logstash
I want to end up with the following attributes:
message.destination: opsbot
message.action: url
message.parameters: logstash
(An action can have multiple parameters. If that's the case, they should all end up in message.parameters.)
How do I get started on this? Do I need to create 3 different attributes, each with its own regex? This doesn't seem efficient. Or can I do this in one go? If so, can someone explain to me how, and what the correct regex should be?
A few examples on the documentation page could make this a lot more understandable to people who are new to NiFi.
Kind regards,
Jeroen
Re: Struggling with ExtractText
Posted by Matthew Burgess <ma...@gmail.com>.
Jeroen,
You can achieve this either with 3 different attributes, each with its own regex (as you mentioned), or with a single regex and the capture-group indexing that ExtractText provides.
For the latter, add a property called “line” (for example) that uses something like the following regex:
/(\S+) (\S+) (\S+)
This splits the line into the three values you mention, except that the attribute names are based on the capture-group index, so:
line.1 = opsbot
line.2 = url
line.3 = logstash
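To see how the capture groups map onto the indexed names, here is a minimal sketch in Python. It only illustrates the grouping: NiFi evaluates Java regexes, but this particular pattern behaves the same in both. The function name extract_indexed and the use of search (rather than a full match) are assumptions made for the example, not part of ExtractText.

```python
import re

# Same pattern as the "line" property above.
LINE_PATTERN = re.compile(r"/(\S+) (\S+) (\S+)")


def extract_indexed(content, prefix="line"):
    """Return {prefix.N: capture group N}, or an empty dict if nothing matches.

    Hypothetical helper: it imitates the naming scheme described above,
    not the ExtractText implementation itself.
    """
    match = LINE_PATTERN.search(content)
    if match is None:
        return {}
    # Groups are numbered from 1, matching line.1, line.2, line.3.
    return {f"{prefix}.{i}": value
            for i, value in enumerate(match.groups(), start=1)}


attributes = extract_indexed("/opsbot url logstash")
print(attributes)
# {'line.1': 'opsbot', 'line.2': 'url', 'line.3': 'logstash'}
```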
If you need the attributes named as you mention, you can add an UpdateAttribute processor that loads the values above into the attributes you wish:
message.destination = ${line.1}
message.action = ${line.2}
message.parameters = ${line.3}
If parameters should be “the rest of the line”, the regex would change, but only in the last (\S+) group; the approach would remain the same.
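As a sketch of the “rest of the line” case, one option is to replace the last (\S+) with (.*), which captures everything up to the end of the line. This exact pattern is an assumption (it is not given in the thread), and the Python below only illustrates the regex and the renaming step; parse_message is a hypothetical helper, not a NiFi API.

```python
import re

# Assumed variant of the pattern: the last group is (.*) instead of (\S+),
# so it captures all remaining text on the line.
REST_PATTERN = re.compile(r"/(\S+) (\S+) (.*)")


def parse_message(content):
    """Return the three message.* attributes, or an empty dict if no match."""
    match = REST_PATTERN.search(content)
    if match is None:
        return {}
    destination, action, parameters = match.groups()
    # Mirrors the UpdateAttribute renaming: line.1, line.2, line.3 become
    # message.destination, message.action, message.parameters.
    return {
        "message.destination": destination,
        "message.action": action,
        "message.parameters": parameters,
    }


print(parse_message("/opsbot url logstash kibana"))
# {'message.destination': 'opsbot', 'message.action': 'url', 'message.parameters': 'logstash kibana'}
```

Note that this variant still expects a space after the action, so a line with no parameters at all (such as "/opsbot url") would not match; whether that matters depends on the messages you receive.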
Hope this helps! Cheers,
Matt