Posted to users@nifi.apache.org by Jeff - Data Bean Australia <da...@gmail.com> on 2016/02/17 03:36:13 UTC

Generate URL based on different conditions

Hi,

I have a use case like this:

There are two files, say fileA and fileB. Each contains multiple lines of
items that are used to generate URLs. However, the algorithm for generating
the URLs differs between the two. If items come from fileA, the URL template
looks like this:

foo-<item>-foo

If items come from fileB, the template looks like this:

bar-<item>-foo-<item>-whatever

I am going to create a NiFi template for the data flow, from reading the
list file up to downloading data using InvokeHTTP, and place an
UpdateAttribute processor in front of the template to feed in the different
file names (I have only two files).

The problem I have so far is how to generate the URLs based on the different
inputs, so that I can make one general, reusable NiFi template.

Thanks,
Jeff



-- 
Data Bean - A Big Data Solution Provider in Australia.

Re: Generate URL based on different conditions

Posted by Jeff - Data Bean Australia <da...@gmail.com>.
Thank you Matt and Joe for your help.


Re: Generate URL based on different conditions

Posted by Matt Burgess <ma...@gmail.com>.
Here's a Gist template that uses Joe's approach of RouteOnAttribute followed
by UpdateAttribute to generate URLs for the use case you described:
https://gist.github.com/mattyb149/8fd87efa13388888a70c


Re: Generate URL based on different conditions

Posted by Joe Witt <jo...@gmail.com>.
Jeff,

For each of the input files, could it be that you would pull data from
multiple URLs?

Have you had a chance to learn about the NiFi Expression Language?
That will come in quite handy for constructing the URL used in
InvokeHTTP.

The general pattern I think makes sense here is:
- Gather Data
- Extract Features from data to construct URL
- Fetch document/response from URL

During 'Gather Data' you acquire the files.
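
As a rough illustration of what comes out of this step (plain Python,
not NiFi itself): each line of a list file becomes one record that
remembers which file it came from. In NiFi this kind of per-line split
is typically done with a processor such as SplitText. The names
'items_to_records', 'source.file' and 'item' are hypothetical and
chosen only for this sketch.

```python
def items_to_records(source_name, text):
    """Turn the text of a list file into one record per non-empty line.

    Each record is a plain dict standing in for a flow file's attributes.
    The keys 'source.file' and 'item' are hypothetical names for this sketch.
    """
    records = []
    for line in text.splitlines():
        item = line.strip()
        if item:  # skip blank lines
            records.append({"source.file": source_name, "item": item})
    return records


if __name__ == "__main__":
    for record in items_to_records("fileA", "x1\nx2\n"):
        print(record)
```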

During 'Extract Features' you pull elements of the file's content out
into flow file attributes.  You can use RouteOnAttribute to send each
flow file to an UpdateAttribute processor which constructs a new
attribute holding URL pattern A or URL pattern B respectively.  You can
also collapse that into a single UpdateAttribute, possibly using the
advanced UI, and set specific URLs based on patterns of attributes.
Lots of ways to slice that.
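
The route-then-update logic can be sketched in plain Python to make the
intent concrete. This is only a model of the decision, not NiFi
configuration. The two templates are the ones from Jeff's question; the
attribute names 'source.file' and 'item' are assumptions made for the
sketch, and 'the-url' is just an example attribute name.

```python
# URL templates from the original question, keyed by the source file name.
URL_TEMPLATES = {
    "fileA": "foo-{item}-foo",
    "fileB": "bar-{item}-foo-{item}-whatever",
}


def add_url_attribute(attributes):
    """Return a copy of the attributes with 'the-url' filled in.

    Picking a template by source file plays the role of RouteOnAttribute;
    filling it in plays the role of UpdateAttribute. The attribute names
    'source.file' and 'item' are hypothetical.
    """
    source = attributes["source.file"]
    if source not in URL_TEMPLATES:
        # In a real flow this would be the 'unmatched' route.
        raise ValueError(f"no URL template for source {source!r}")
    url = URL_TEMPLATES[source].format(item=attributes["item"])
    return {**attributes, "the-url": url}


if __name__ == "__main__":
    print(add_url_attribute({"source.file": "fileA", "item": "x1"})["the-url"])
    print(add_url_attribute({"source.file": "fileB", "item": "x1"})["the-url"])
```

Keeping the templates in one table mirrors the "single UpdateAttribute"
variant: adding a third file would mean adding one more entry rather
than one more branch.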

During 'Fetch document' you should be able to use just a single
InvokeHTTP, which looks at some attribute you've defined, say
'the-url'.  In InvokeHTTP, specify the remote URL value to be
"${the-url}".

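To show what a reference like "${the-url}" amounts to, here is a small
Python model of the simplest case only: a bare attribute name inside
${...} is replaced by that attribute's value. It is not the NiFi
Expression Language implementation and handles none of its functions.
The helper name 'resolve' is hypothetical, and treating a missing
attribute as an empty string is an assumption of this sketch.

```python
import re

# Matches a bare attribute reference such as ${the-url}.
# References containing ':' (function calls) are left untouched.
_ATTRIBUTE_REF = re.compile(r"\$\{([^}:]+)\}")


def resolve(property_value, attributes):
    """Substitute ${name} references with values from the attributes dict."""
    def lookup(match):
        name = match.group(1).strip()
        # Assumption for this sketch: a missing attribute becomes "".
        return attributes.get(name, "")
    return _ATTRIBUTE_REF.sub(lookup, property_value)


if __name__ == "__main__":
    flow_file_attributes = {"the-url": "foo-x1-foo"}
    print(resolve("${the-url}", flow_file_attributes))
```

The point of the pattern is that the fetch step never needs to know
which file an item came from; it only reads the one attribute.
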
We should publish a template for this pattern/approach if we've not
already but let's see how you progress and decide what would be most
useful for others.

Thanks
Joe
