Posted to users@nifi.apache.org by Mike Thomsen <mi...@gmail.com> on 2019/02/20 13:08:34 UTC

How do you use ElasticSearch with NiFi?

I'm looking for feedback from ElasticSearch users on how they use and how
they **want** to use ElasticSearch v5 and newer with NiFi.

So please respond with some use cases and what you want, what frustrates
you, etc. so I can prioritize Jira tickets for the ElasticSearch REST API
bundle.

(Note: basic JSON DSL queries are already supported via
JsonQueryElasticSearch. If you didn't know that, please try it out and drop
some feedback on what is needed to make it work for your use cases.)

Thanks,

Mike
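
For readers who have not seen one, a basic JSON DSL query is just a JSON document describing the search. The sketch below builds one in Python; the field name "message", the search text, and the helper name build_match_query are made up for illustration, and how the body is handed to the processor is not covered in this thread.

```python
import json


def build_match_query(field, text, size=10):
    """Return a basic Elasticsearch JSON DSL search body as a dict.

    `field` and `text` are illustrative placeholders, not names from the thread.
    """
    return {
        "size": size,                       # maximum number of hits to return
        "query": {"match": {field: text}},  # full-text match on a single field
    }


# Serialize the query the way it would travel in an HTTP request body.
body = json.dumps(build_match_query("message", "error"), indent=2)
print(body)
```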

Re: How do you use ElasticSearch with NiFi?

Posted by Mike Thomsen <mi...@gmail.com>.
> I haven't used any special processor dedicated to ES, just HTTP Request
> processor,

Why did you decide to do that?

Thanks,

Mike

On Wed, Feb 20, 2019 at 11:55 AM Luis Carmona <lc...@openpartner.cl>
wrote:

> Hi everyone,
>
> I've been using NiFi for the last 6 months, interacting with ES 6.x: running
> queries and reading and writing data, all in production environments.
>
> I haven't used any special processor dedicated to ES, just HTTP Request
> processor, and everything has worked nicely. In terms of load, the peak
> has been 200 flowfiles (50 KB each) in one call (SplitRecord), and the queues
> worked perfectly. The only catch is that insert/update with a Painless
> script has to use parameters, otherwise it crashes, but that is an ES issue.
>
> Now I'm trying to access ES through GraphQL and to do bulk inserts, but only
> in a lab environment.
>
> Hope this info helps, and if you want I can keep you posted on the results
> of these last two topics.
>
> Regards,
>
> LC
>
>
>
> ------------------------------
> *From: *"Mike Thomsen" <mi...@gmail.com>
> *To: *"users" <us...@nifi.apache.org>
> *Sent: *Wednesday, February 20, 2019 13:40:02
> *Subject: *Re: How do you use ElasticSearch with NiFi?
>
> I've got a PR for a new bulk ingest processor, so I could easily add
> batching the record ingest to that plus something like your PR. I think it
> might be useful to have some enforcement mechanisms that prevent a request
> from being way too big. Last documentation I saw said about 32MB/payload.
> What do you think about that?
>
> On Wed, Feb 20, 2019 at 11:22 AM Joe Percivall <jp...@apache.org>
> wrote:
>
>> Hey Mike,
>> As a data point, we're ingesting into ES v6 using PutElasticsearchHttp
>> and PutElasticsearchHttpRecord. We do almost no querying of anything in ES
>> using NiFi. Continued improvement around ingesting into ES would be our
>> core use-case.
>>
>> One item that frustrated me was the issue around failures in the record
>> processor that I put up a PR here[1]. Another example of a potential
>> improvement would be to not load the entire request body (and thus all the
>> records/FF content) into memory when inserting into ES using those
>> processors. Not 100% sure how you would go about doing that but would be an
>> awesome improvement. Of course, any other improvements around performance
>> would also be welcome.
>>
>> [1] https://github.com/apache/nifi/pull/3299
>>
>> Cheers,
>> Joe
>>
>> On Wed, Feb 20, 2019 at 8:08 AM Mike Thomsen <mi...@gmail.com>
>> wrote:
>>
>>> I'm looking for feedback from ElasticSearch users on how they use and
>>> how they **want** to use ElasticSearch v5 and newer with NiFi.
>>>
>>> So please respond with some use cases and what you want, what frustrates
>>> you, etc. so I can prioritize Jira tickets for the ElasticSearch REST API
>>> bundle.
>>>
>>> (Note: basic JSON DSL queries are already supported via
>>> JsonQueryElasticSearch. If you didn't know that, please try it out and drop
>>> some feedback on what is needed to make it work for your use cases.)
>>>
>>> Thanks,
>>>
>>> Mike
>>>
>>
>>
>> --
>> *Joe Percivall*
>> linkedin.com/in/Percivall
>> e: jpercivall@apache.com
>>
>
>

Re: How do you use ElasticSearch with NiFi?

Posted by Luis Carmona <lc...@openpartner.cl>.
Hi everyone, 

I've been using NiFi for the last 6 months, interacting with ES 6.x: running queries and reading and writing data, all in production environments.

I haven't used any special processor dedicated to ES, just HTTP Request processor, and everything has worked nicely. In terms of load, the peak has been 200 flowfiles (50 KB each) in one call (SplitRecord), and the queues worked perfectly. The only catch is that insert/update with a Painless script has to use parameters, otherwise it crashes, but that is an ES issue.

Now I'm trying to access ES through GraphQL and to do bulk inserts, but only in a lab environment.

Hope this info helps, and if you want I can keep you posted on the results of these last two topics.

Regards, 

LC 
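
The requests behind the Painless remark are not shown in the thread, so the following is only a guess at what "use parameters" looks like in practice: the script source stays constant and the changing values travel in "params", which lets Elasticsearch reuse the compiled script. The field name "views", the parameter name "increment", and the helper name are hypothetical.

```python
import json


def build_scripted_update(source, params):
    """Build the body of a scripted update that passes values via params.

    The script text stays the same across calls; only `params` changes.
    """
    return {
        "script": {
            "lang": "painless",
            "source": source,   # constant script text, references params.*
            "params": params,   # per-request values, kept out of the script
        }
    }


# Hypothetical example: increment a counter field by a supplied amount.
update_body = build_scripted_update(
    "ctx._source.views += params.increment",
    {"increment": 1},
)
print(json.dumps(update_body))
```

A body like this would be sent to the document's update endpoint by whatever HTTP processor or client is in use; the endpoint path depends on the Elasticsearch version and is left out here.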





Re: How do you use ElasticSearch with NiFi?

Posted by Mike Thomsen <mi...@gmail.com>.
I've got a PR for a new bulk ingest processor, so I could easily add
batching of the record ingest to that, plus something like your PR. I think
it might be useful to have some enforcement mechanism that prevents a request
from being way too big. The last documentation I saw said about 32MB/payload.
What do you think about that?
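
One way to picture the size enforcement is a batcher that cuts a new bulk payload whenever the next record would push it over a byte limit. This is a minimal Python sketch, not the code from either PR; the function name, the index name, and the default limit (taken from the "about 32MB" figure above, which is unverified) are assumptions. Depending on the Elasticsearch version, the action line may also need a "_type".

```python
import json


def bulk_batches(docs, index, max_bytes=32 * 1024 * 1024):
    """Yield NDJSON bulk payloads (bytes) that stay within max_bytes.

    A single record larger than max_bytes is still emitted, alone in its own
    payload, so the caller can decide how to handle it.
    """
    action = json.dumps({"index": {"_index": index}})
    batch, batch_size = [], 0
    for doc in docs:
        # Each record contributes an action line and a source line.
        entry = (action + "\n" + json.dumps(doc) + "\n").encode("utf-8")
        if batch and batch_size + len(entry) > max_bytes:
            yield b"".join(batch)
            batch, batch_size = [], 0
        batch.append(entry)
        batch_size += len(entry)
    if batch:
        yield b"".join(batch)


# Usage: a deliberately small limit to show the splitting behaviour.
for payload in bulk_batches([{"n": i} for i in range(5)], "logs", max_bytes=80):
    print(len(payload), "bytes")
```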

On Wed, Feb 20, 2019 at 11:22 AM Joe Percivall <jp...@apache.org>
wrote:

> Hey Mike,
>
> As a data point, we're ingesting into ES v6 using PutElasticsearchHttp and
> PutElasticsearchHttpRecord. We do almost no querying of anything in ES
> using NiFi. Continued improvement around ingesting into ES would be our
> core use-case.
>
> One item that frustrated me was the issue around failures in the record
> processor that I put up a PR here[1]. Another example of a potential
> improvement would be to not load the entire request body (and thus all the
> records/FF content) into memory when inserting into ES using those
> processors. Not 100% sure how you would go about doing that but would be an
> awesome improvement. Of course, any other improvements around performance
> would also be welcome.
>
> [1] https://github.com/apache/nifi/pull/3299
>
> Cheers,
> Joe
>
> On Wed, Feb 20, 2019 at 8:08 AM Mike Thomsen <mi...@gmail.com>
> wrote:
>
>> I'm looking for feedback from ElasticSearch users on how they use and how
>> they **want** to use ElasticSearch v5 and newer with NiFi.
>>
>> So please respond with some use cases and what you want, what frustrates
>> you, etc. so I can prioritize Jira tickets for the ElasticSearch REST API
>> bundle.
>>
>> (Note: basic JSON DSL queries are already supported via
>> JsonQueryElasticSearch. If you didn't know that, please try it out and drop
>> some feedback on what is needed to make it work for your use cases.)
>>
>> Thanks,
>>
>> Mike
>>
>
>
> --
> *Joe Percivall*
> linkedin.com/in/Percivall
> e: jpercivall@apache.com
>

Re: How do you use ElasticSearch with NiFi?

Posted by Joe Percivall <jp...@apache.org>.
Hey Mike,

As a data point, we're ingesting into ES v6 using PutElasticsearchHttp and
PutElasticsearchHttpRecord. We do almost no querying of anything in ES
using NiFi. Continued improvement around ingesting into ES would be our
core use-case.

One item that frustrated me was the issue around failures in the record
processor that I put up a PR here[1]. Another example of a potential
improvement would be to not load the entire request body (and thus all the
records/FF content) into memory when inserting into ES using those
processors. Not 100% sure how you would go about doing that but would be an
awesome improvement. Of course, any other improvements around performance
would also be welcome.

[1] https://github.com/apache/nifi/pull/3299

Cheers,
Joe

On Wed, Feb 20, 2019 at 8:08 AM Mike Thomsen <mi...@gmail.com> wrote:

> I'm looking for feedback from ElasticSearch users on how they use and how
> they **want** to use ElasticSearch v5 and newer with NiFi.
>
> So please respond with some use cases and what you want, what frustrates
> you, etc. so I can prioritize Jira tickets for the ElasticSearch REST API
> bundle.
>
> (Note: basic JSON DSL queries are already supported via
> JsonQueryElasticSearch. If you didn't know that, please try it out and drop
> some feedback on what is needed to make it work for your use cases.)
>
> Thanks,
>
> Mike
>


-- 
*Joe Percivall*
linkedin.com/in/Percivall
e: jpercivall@apache.com
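
As a rough illustration of the "don't load the entire request body into memory" idea, a generator can produce the bulk body one record at a time, so only the current record is held on the client side. This is a sketch under assumptions, not how the NiFi processors work: the input is assumed to be newline-delimited JSON, and the function and index names are invented. It would need an HTTP client that accepts an iterable as the request body, and it says nothing about how much the server buffers.

```python
import io
import json


def stream_bulk_lines(record_lines, index):
    """Lazily yield bulk NDJSON chunks, one record at a time.

    `record_lines` is any iterable of text lines, each holding one JSON record.
    """
    action = (json.dumps({"index": {"_index": index}}) + "\n").encode("utf-8")
    for line in record_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        doc = json.loads(line)  # fails early if a record is not valid JSON
        yield action
        yield (json.dumps(doc) + "\n").encode("utf-8")


# Demonstration only: list() materializes the chunks so they can be inspected;
# a real sender would iterate over the generator instead.
source = io.StringIO('{"id": 1}\n\n{"id": 2}\n')
chunks = list(stream_bulk_lines(source, "logs"))
print(len(chunks))
```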