Posted to dev@nifi.apache.org by Mike Thomsen <mi...@gmail.com> on 2017/06/28 10:56:36 UTC

Getting reviews on two pull requests

Hi,

I put out two pull requests for the Mongo bundle based on use cases we ran
into when doing bulk ingestion with Mongo.

https://github.com/apache/nifi/pull/1948

This one allows GetMongo to bundle results into a large JSON array in a
flowfile based on user-defined limits. We needed it because GetMongo was
generating one flowfile per result and we had to do a full transfer of a
Mongo cluster for processing with Elasticsearch (the result would have been
well over 12M flowfiles).
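
As a rough illustration of the batching idea only (this is not the code in
the pull request; the function name batch_results and the parameter
results_per_flowfile are hypothetical), a minimal Python sketch might look
like this:

```python
import json
from typing import Iterable, Iterator, List


def batch_results(docs: Iterable[dict], results_per_flowfile: int) -> Iterator[str]:
    """Yield JSON array strings, each holding at most results_per_flowfile docs.

    Hypothetical helper: it stands in for grouping query results into one
    flowfile payload per batch instead of one flowfile per result.
    """
    if results_per_flowfile < 1:
        raise ValueError("results_per_flowfile must be at least 1")
    batch: List[dict] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == results_per_flowfile:
            # The batch is full: emit it as one JSON array and start a new one.
            yield json.dumps(batch)
            batch = []
    if batch:
        # Emit the final, partially filled batch.
        yield json.dumps(batch)


# Five stand-in documents with a limit of two per payload give three payloads.
payloads = list(batch_results(({"_id": i} for i in range(5)), 2))
```

With an assumed limit of 1,000 results per flowfile, 12M results would
shrink to about 12,000 flowfiles.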

https://github.com/apache/nifi/pull/1945

This one adds PutMongoRecord, a variant of PutMongo that uses the Record API.

I have plenty of time to work out any changes in our code needed to make
these changes ready for merge if someone wants to take some time to look
over them and give me feedback.

Thanks,

Mike

Re: Getting reviews on two pull requests

Posted by Pierre Villard <pi...@gmail.com>.
Hey Mike,

I should be able to have a look at 1948. I already pulled a Mongo image on
my docker and I should have something running in the afternoon.

Pierre

2017-06-28 14:03 GMT+02:00 Mark Payne <ma...@hotmail.com>:

> Hey Mike,
>
> I think I can review 1945 today. I don't have much knowledge of Mongo but
> did a lot of the work on the record stuff. Hopefully I can find a good
> Mongo docker container for testing. If nobody else has reviewed the other
> one when I finish 1945 then I may have a chance to review that one today as
> well but not sure if I can get to both of them today or not.
>
> Thanks
> -Mark
>
> Sent from my iPhone
>

Re: Getting reviews on two pull requests

Posted by Mark Payne <ma...@hotmail.com>.
Hey Mike,

I think I can review 1945 today. I don't have much knowledge of Mongo, but I did a lot of the work on the record stuff. Hopefully I can find a good Mongo Docker container for testing. If nobody else has reviewed the other one by the time I finish 1945, I may have a chance to review that one today as well, but I'm not sure I can get to both of them today.

Thanks
-Mark

Sent from my iPhone
