Posted to users@nifi.apache.org by Gop Krr <go...@gmail.com> on 2016/10/27 21:46:33 UTC

nifi is running out of memory

Hi All,

I have a very simple data flow where I need to move S3 data from a bucket
in one account to a bucket in another account. I have attached my
processor configuration.


2016-10-27 20:09:57,626 ERROR [Flow Service Tasks Thread-2]
org.apache.nifi.NiFi An Unknown Error Occurred in Thread Thread[Flow
Service Tasks Thread-2,5,main]: java.lang.OutOfMemoryError: Java heap space

I am very new to NiFi and trying to get a few of the use cases going. I
need help from the community.

Thanks again

Rai

Re: nifi is running out of memory

Posted by Gop Krr <go...@gmail.com>.
Thanks, Joe, for checking. Yes, I got past it and was able to successfully
demo it to the team :) Now the next challenge is to drive enough
performance out of NiFi for high throughput.


Re: nifi is running out of memory

Posted by Joe Witt <jo...@gmail.com>.
Krish,

Did you ever get past this?

Thanks
Joe

>>>>> On Fri, Oct 28, 2016 at 6:31 AM, Pierre Villard
>>>>> <pi...@gmail.com> wrote:
>>>>>>
>>>>>> Quick remark: the fix has also been merged in master and will be in
>>>>>> release 1.1.0.
>>>>>>
>>>>>> Pierre
>>>>>>
>>>>>> 2016-10-28 15:22 GMT+02:00 Gop Krr <go...@gmail.com>:
>>>>>>>
>>>>>>> Thanks Adam. I will try 0.7.1 and update the community on the
>>>>>>> outcome. If it works then I can create a patch for 1.x
>>>>>>> Thanks
>>>>>>> Rai
>>>>>>>
>>>>>>> On Thu, Oct 27, 2016 at 7:41 PM, Adam Lamar <ad...@gmail.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hey All,
>>>>>>>>
>>>>>>>> I believe OP is running into a bug fixed here:
>>>>>>>> https://issues.apache.org/jira/browse/NIFI-2631
>>>>>>>>
>>>>>>>> Basically, ListS3 attempts to commit all the files it finds
>>>>>>>> (potentially 100k+) at once, rather than in batches. NIFI-2631
>>>>>>>> addresses the issue. Looks like the fix is out in 0.7.1 but not yet
>>>>>>>> in
>>>>>>>> a 1.x release.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Adam
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Oct 27, 2016 at 7:59 PM, Joe Witt <jo...@gmail.com>
>>>>>>>> wrote:
>>>>>>>> > Looking at this line [1] makes me think the FetchS3 processor is
>>>>>>>> > properly streaming the bytes directly to the content repository.
>>>>>>>> >
>>>>>>>> > Looking at the screenshot showing nothing out of the ListS3
>>>>>>>> > processor
>>>>>>>> > makes me think the bucket has so many things in it that the
>>>>>>>> > processor
>>>>>>>> > or associated library isn't handling that well and is just listing
>>>>>>>> > everything with no mechanism of max buffer size.  Krish please try
>>>>>>>> > with the largest heap you can and let us know what you see.
>>>>>>>> >
>>>>>>>> > [1]
>>>>>>>> > https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/main/java/org/apache/nifi/processors/aws/s3/FetchS3Object.java#L107
>>>>>>>> >
>>>>>>>> > On Thu, Oct 27, 2016 at 9:37 PM, Joe Witt <jo...@gmail.com>
>>>>>>>> > wrote:
>>>>>>>> >> moving dev to bcc
>>>>>>>> >>
>>>>>>>> >> Yes I believe the issue here is that FetchS3 doesn't do chunked
>>>>>>>> >> transfers and so is loading all into memory.  I've not verified
>>>>>>>> >> this
>>>>>>>> >> in the code yet but it seems quite likely.  Krish if you can
>>>>>>>> >> verify
>>>>>>>> >> that going with a larger heap gets you in the game can you please
>>>>>>>> >> file
>>>>>>>> >> a JIRA.
>>>>>>>> >>
>>>>>>>> >> Thanks
>>>>>>>> >> Joe
>>>>>>>> >>
>>>>>>>> >> On Thu, Oct 27, 2016 at 9:34 PM, Bryan Bende <bb...@gmail.com>
>>>>>>>> >> wrote:
>>>>>>>> >>> Hello,
>>>>>>>> >>>
>>>>>>>> >>> Are you running with all of the default settings?
>>>>>>>> >>>
>>>>>>>> >>> If so you would probably want to try increasing the memory
>>>>>>>> >>> settings in
>>>>>>>> >>> conf/bootstrap.conf.
>>>>>>>> >>>
>>>>>>>> >>> They default to 512mb, you may want to try bumping it up to
>>>>>>>> >>> 1024mb.
>>>>>>>> >>>
>>>>>>>> >>> -Bryan
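
For reference, the heap settings Bryan mentions live in conf/bootstrap.conf
as JVM arguments passed to NiFi at startup. A minimal sketch of the change
he suggests, assuming the stock bootstrap.conf layout (the argument indexes
may differ between installs):

    # conf/bootstrap.conf -- JVM heap settings (512m by default)
    java.arg.2=-Xms1024m
    java.arg.3=-Xmx1024m

NiFi has to be restarted for a new heap size to take effect.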

Re: nifi is running out of memory

Posted by Gop Krr <go...@gmail.com>.
James, the permission issue got resolved. I still don't see any writes.


Re: nifi is running out of memory

Posted by Gop Krr <go...@gmail.com>.
Thanks, James. I am looking into the permission issue and will update the
thread. I will also make the changes per your recommendation.


Re: nifi is running out of memory

Posted by James Wing <jv...@gmail.com>.
From the screenshot and the error message, I interpret the sequence of
events to be something like this:

1.) ListS3 succeeds and generates flowfiles with attributes referencing S3
objects, but no content (0 bytes)
2.) FetchS3Object fails to pull the S3 object content with an Access Denied
error, but the failed flowfiles are routed on to PutS3Object (35,179 files
/ 0 bytes in the "putconnector" queue)
3.) PutS3Object is succeeding, writing the 0 byte content from ListS3

I recommend a couple of things for FetchS3Object:

* Only allow the "success" relationship to continue to PutS3Object.
Separate the "failure" relationship to either loop back to FetchS3Object or
go to a LogAttribute processor, or another handling path.
* It looks like the permissions aren't working; you might want to
double-check the access keys or try fetching a sample file with the AWS CLI.

Thanks,

James
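
To illustrate that last check outside of NiFi, here is a minimal,
hypothetical probe written against the same AWS SDK for Java 1.10.x that
the NiFi AWS processors bundle; the bucket, key, and credentials are
placeholders for the values configured on FetchS3Object:

    import com.amazonaws.auth.BasicAWSCredentials;
    import com.amazonaws.services.s3.AmazonS3Client;
    import com.amazonaws.services.s3.model.AmazonS3Exception;
    import com.amazonaws.services.s3.model.S3Object;

    public class S3AccessCheck {
        public static void main(String[] args) throws Exception {
            // Placeholder credentials -- use the same key pair set on the processor
            AmazonS3Client s3 = new AmazonS3Client(
                    new BasicAWSCredentials("ACCESS_KEY", "SECRET_KEY"));
            try {
                // A 403 AccessDenied here reproduces the processor's failure outside NiFi
                S3Object object = s3.getObject("source-bucket", "sample-key.gz");
                System.out.println("Fetched "
                        + object.getObjectMetadata().getContentLength() + " bytes");
                object.close();
            } catch (AmazonS3Exception e) {
                System.err.println("S3 error: " + e.getErrorCode()
                        + " (status " + e.getStatusCode() + ")");
            }
        }
    }

If this fails with the same AccessDenied, the problem is the key pair or
bucket policy rather than anything in the flow.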



Re: nifi is running out of memory

Posted by Gop Krr <go...@gmail.com>.
This is what my NiFi flow looks like.


Re: nifi is running out of memory

Posted by Gop Krr <go...@gmail.com>.
Thanks Bryan, Joe, Adam and Pierre. I got past this issue by switching to
0.7.1. Now it is able to list the files from one bucket and create those
files in the other bucket, but the write is not happening and I am getting
a permission error (attached below for reference). Could this be a setting
on the buckets, or does it have more to do with the access keys? All the
files created in the new bucket are 0 bytes.
Thanks
Rai

2016-10-28 16:45:25,438 ERROR [Timer-Driven Process Thread-3]
o.a.nifi.processors.aws.s3.FetchS3Object FetchS3Object[id=xxxxx] Failed to
retrieve S3 Object for
StandardFlowFileRecord[uuid=yyyyy,claim=,offset=0,name=xxxxx.gz,size=0];
routing to failure: com.amazonaws.services.s3.model.AmazonS3Exception:
Access Denied (Service: Amazon S3; Status Code: 403; Error Code:
AccessDenied; Request ID: xxxxxxx), S3 Extended Request ID:
lu8tAqRxu+ouinnVvJleHkUUyK6J6rIQCTw0G8G6DB6NOPGec0D1KB6cfUPsj08IQXI8idtiTp4=

2016-10-28 16:45:25,438 ERROR [Timer-Driven Process Thread-3]
o.a.nifi.processors.aws.s3.FetchS3Object

com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service:
Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID:
0F34E71C0697B1D8)

        at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1219) ~[aws-java-sdk-core-1.10.32.jar:na]
        at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:803) ~[aws-java-sdk-core-1.10.32.jar:na]
        at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:505) ~[aws-java-sdk-core-1.10.32.jar:na]
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:317) ~[aws-java-sdk-core-1.10.32.jar:na]
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3595) ~[aws-java-sdk-s3-1.10.32.jar:na]
        at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1116) ~[aws-java-sdk-s3-1.10.32.jar:na]
        at org.apache.nifi.processors.aws.s3.FetchS3Object.onTrigger(FetchS3Object.java:106) ~[nifi-aws-processors-0.7.1.jar:0.7.1]
        at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) [nifi-api-0.7.1.jar:0.7.1]
        at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1054) [nifi-framework-core-0.7.1.jar:0.7.1]
        at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-0.7.1.jar:0.7.1]
        at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-0.7.1.jar:0.7.1]
        at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:127) [nifi-framework-core-0.7.1.jar:0.7.1]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_101]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_101]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_101]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_101]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_101]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
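
One quick way to tell whether the 403 above comes from the bucket settings
or from the access key is to replay the same GetObject call outside NiFi
with the same credentials. A minimal sketch against the same AWS SDK the
processor uses; every literal below is a placeholder, not a value from this
thread:

    import com.amazonaws.auth.BasicAWSCredentials;
    import com.amazonaws.services.s3.AmazonS3Client;
    import com.amazonaws.services.s3.model.S3Object;

    // Standalone check: replay the failing GetObject with the same key pair.
    public class S3AccessCheck {
        public static void main(String[] args) {
            AmazonS3Client client = new AmazonS3Client(
                    new BasicAWSCredentials("ACCESS_KEY_ID", "SECRET_ACCESS_KEY"));
            // Throws AmazonS3Exception (Status Code: 403) if the key lacks
            // s3:GetObject on the source bucket; a cross-account read also
            // needs a grant in the source bucket's policy.
            S3Object object = client.getObject("source-bucket-name", "path/to/sample.gz");
            System.out.println("Fetched " + object.getObjectMetadata().getContentLength() + " bytes");
        }
    }

If this fails with the same AccessDenied, the credentials or bucket policy
are at fault; if it succeeds, the problem is on the NiFi side.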

On Fri, Oct 28, 2016 at 6:31 AM, Pierre Villard <pierre.villard.fr@gmail.com> wrote:

> Quick remark: the fix has also been merged in master and will be in
> release 1.1.0.
>
> Pierre
>
> 2016-10-28 15:22 GMT+02:00 Gop Krr <go...@gmail.com>:
>
>> Thanks Adam. I will try 0.7.1 and update the community on the outcome. If
>> it works, then I can create a patch for 1.x.
>> Thanks
>> Rai
>>
>> On Thu, Oct 27, 2016 at 7:41 PM, Adam Lamar <ad...@gmail.com> wrote:
>>
>>> Hey All,
>>>
>>> I believe OP is running into a bug fixed here:
>>> https://issues.apache.org/jira/browse/NIFI-2631
>>>
>>> Basically, ListS3 attempts to commit all the files it finds
>>> (potentially 100k+) at once, rather than in batches. NIFI-2631
>>> addresses the issue. Looks like the fix is out in 0.7.1 but not yet in
>>> a 1.x release.
>>>
>>> Cheers,
>>> Adam
>>>
>>>
>>> On Thu, Oct 27, 2016 at 7:59 PM, Joe Witt <jo...@gmail.com> wrote:
>>> > Looking at this line [1] makes me think the FetchS3 processor is
>>> > properly streaming the bytes directly to the content repository.
>>> >
>>> > Looking at the screenshot showing nothing out of the ListS3 processor
>>> > makes me think the bucket has so many things in it that the processor
>>> > or associated library isn't handling that well and is just listing
>>> > everything with no mechanism to cap the buffer size.  Krish, please try
>>> > with the largest heap you can and let us know what you see.
>>> >
>>> > [1] https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/main/java/org/apache/nifi/processors/aws/s3/FetchS3Object.java#L107
>>> >
>>> > On Thu, Oct 27, 2016 at 9:37 PM, Joe Witt <jo...@gmail.com> wrote:
>>> >> moving dev to bcc
>>> >>
>>> >> Yes, I believe the issue here is that FetchS3 doesn't do chunked
>>> >> transfers and so is loading it all into memory.  I've not verified this
>>> >> in the code yet, but it seems quite likely.  Krish, if you can verify
>>> >> that going with a larger heap gets you in the game, can you please file
>>> >> a JIRA?
>>> >>
>>> >> Thanks
>>> >> Joe
>>> >>
>>> >> On Thu, Oct 27, 2016 at 9:34 PM, Bryan Bende <bb...@gmail.com> wrote:
>>> >>> Hello,
>>> >>>
>>> >>> Are you running with all of the default settings?
>>> >>>
>>> >>> If so, you would probably want to try increasing the memory settings
>>> >>> in conf/bootstrap.conf.
>>> >>>
>>> >>> They default to 512 MB; you may want to try bumping it up to 1024 MB.
>>> >>>
>>> >>> -Bryan
>>> >>>
>>> >>> On Thu, Oct 27, 2016 at 5:46 PM, Gop Krr <go...@gmail.com> wrote:
>>> >>>>
>>> >>>> Hi All,
>>> >>>>
>>> >>>> I have a very simple data flow, where I need to move S3 data from
>>> >>>> one bucket in one account to another bucket under another account.
>>> >>>> I have attached my processor configuration.
>>> >>>>
>>> >>>>
>>> >>>> 2016-10-27 20:09:57,626 ERROR [Flow Service Tasks Thread-2]
>>> >>>> org.apache.nifi.NiFi An Unknown Error Occurred in Thread Thread[Flow
>>> >>>> Service Tasks Thread-2,5,main]: java.lang.OutOfMemoryError: Java heap space
>>> >>>>
>>> >>>> I am very new to NiFi and trying to get a few of the use cases going.
>>> >>>> I need help from the community.
>>> >>>>
>>> >>>> Thanks again
>>> >>>>
>>> >>>> Rai
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>
>>>
>>
>>
>

Re: nifi is running out of memory

Posted by Pierre Villard <pi...@gmail.com>.
Quick remark: the fix has also been merged in master and will be in release
1.1.0.

Pierre

2016-10-28 15:22 GMT+02:00 Gop Krr <go...@gmail.com>:

> Thanks Adam. I will try 0.7.1 and update the community on the outcome. If
> it works, then I can create a patch for 1.x.
> Thanks
> Rai
>
> On Thu, Oct 27, 2016 at 7:41 PM, Adam Lamar <ad...@gmail.com> wrote:
>
>> Hey All,
>>
>> I believe OP is running into a bug fixed here:
>> https://issues.apache.org/jira/browse/NIFI-2631
>>
>> Basically, ListS3 attempts to commit all the files it finds
>> (potentially 100k+) at once, rather than in batches. NIFI-2631
>> addresses the issue. Looks like the fix is out in 0.7.1 but not yet in
>> a 1.x release.
>>
>> Cheers,
>> Adam
>>
>>
>> On Thu, Oct 27, 2016 at 7:59 PM, Joe Witt <jo...@gmail.com> wrote:
>> > Looking at this line [1] makes me think the FetchS3 processor is
>> > properly streaming the bytes directly to the content repository.
>> >
>> > Looking at the screenshot showing nothing out of the ListS3 processor
>> > makes me think the bucket has so many things in it that the processor
>> > or associated library isn't handling that well and is just listing
>> > everything with no mechanism to cap the buffer size.  Krish, please try
>> > with the largest heap you can and let us know what you see.
>> >
>> > [1] https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/main/java/org/apache/nifi/processors/aws/s3/FetchS3Object.java#L107
>> >
>> > On Thu, Oct 27, 2016 at 9:37 PM, Joe Witt <jo...@gmail.com> wrote:
>> >> moving dev to bcc
>> >>
>> >> Yes, I believe the issue here is that FetchS3 doesn't do chunked
>> >> transfers and so is loading it all into memory.  I've not verified this
>> >> in the code yet, but it seems quite likely.  Krish, if you can verify
>> >> that going with a larger heap gets you in the game, can you please file
>> >> a JIRA?
>> >>
>> >> Thanks
>> >> Joe
>> >>
>> >> On Thu, Oct 27, 2016 at 9:34 PM, Bryan Bende <bb...@gmail.com> wrote:
>> >>> Hello,
>> >>>
>> >>> Are you running with all of the default settings?
>> >>>
>> >>> If so, you would probably want to try increasing the memory settings in
>> >>> conf/bootstrap.conf.
>> >>>
>> >>> They default to 512 MB; you may want to try bumping it up to 1024 MB.
>> >>>
>> >>> -Bryan
>> >>>
>> >>> On Thu, Oct 27, 2016 at 5:46 PM, Gop Krr <go...@gmail.com> wrote:
>> >>>>
>> >>>> Hi All,
>> >>>>
>> >>>> I have a very simple data flow, where I need to move S3 data from
>> >>>> one bucket in one account to another bucket under another account.
>> >>>> I have attached my processor configuration.
>> >>>>
>> >>>>
>> >>>> 2016-10-27 20:09:57,626 ERROR [Flow Service Tasks Thread-2]
>> >>>> org.apache.nifi.NiFi An Unknown Error Occurred in Thread Thread[Flow
>> >>>> Service Tasks Thread-2,5,main]: java.lang.OutOfMemoryError: Java heap space
>> >>>>
>> >>>> I am very new to NiFi and trying to get a few of the use cases going.
>> >>>> I need help from the community.
>> >>>>
>> >>>> Thanks again
>> >>>>
>> >>>> Rai
>> >>>>
>> >>>>
>> >>>>
>> >>>
>>
>
>

Re: nifi is running out of memory

Posted by Gop Krr <go...@gmail.com>.
Thanks Adam. I will try 0.7.1 and update the community on the outcome. If
it works, then I can create a patch for 1.x.
Thanks
Rai

On Thu, Oct 27, 2016 at 7:41 PM, Adam Lamar <ad...@gmail.com> wrote:

> Hey All,
>
> I believe OP is running into a bug fixed here:
> https://issues.apache.org/jira/browse/NIFI-2631
>
> Basically, ListS3 attempts to commit all the files it finds
> (potentially 100k+) at once, rather than in batches. NIFI-2631
> addresses the issue. Looks like the fix is out in 0.7.1 but not yet in
> a 1.x release.
>
> Cheers,
> Adam
>
>
> On Thu, Oct 27, 2016 at 7:59 PM, Joe Witt <jo...@gmail.com> wrote:
> > Looking at this line [1] makes me think the FetchS3 processor is
> > properly streaming the bytes directly to the content repository.
> >
> > Looking at the screenshot showing nothing out of the ListS3 processor
> > makes me think the bucket has so many things in it that the processor
> > or associated library isn't handling that well and is just listing
> > everything with no mechanism to cap the buffer size.  Krish, please try
> > with the largest heap you can and let us know what you see.
> >
> > [1] https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/main/java/org/apache/nifi/processors/aws/s3/FetchS3Object.java#L107
> >
> > On Thu, Oct 27, 2016 at 9:37 PM, Joe Witt <jo...@gmail.com> wrote:
> >> moving dev to bcc
> >>
> >> Yes, I believe the issue here is that FetchS3 doesn't do chunked
> >> transfers and so is loading it all into memory.  I've not verified this
> >> in the code yet, but it seems quite likely.  Krish, if you can verify
> >> that going with a larger heap gets you in the game, can you please file
> >> a JIRA?
> >>
> >> Thanks
> >> Joe
> >>
> >> On Thu, Oct 27, 2016 at 9:34 PM, Bryan Bende <bb...@gmail.com> wrote:
> >>> Hello,
> >>>
> >>> Are you running with all of the default settings?
> >>>
> >>> If so, you would probably want to try increasing the memory settings in
> >>> conf/bootstrap.conf.
> >>>
> >>> They default to 512 MB; you may want to try bumping it up to 1024 MB.
> >>>
> >>> -Bryan
> >>>
> >>> On Thu, Oct 27, 2016 at 5:46 PM, Gop Krr <go...@gmail.com> wrote:
> >>>>
> >>>> Hi All,
> >>>>
> >>>> I have a very simple data flow, where I need to move S3 data from
> >>>> one bucket in one account to another bucket under another account.
> >>>> I have attached my processor configuration.
> >>>>
> >>>>
> >>>> 2016-10-27 20:09:57,626 ERROR [Flow Service Tasks Thread-2]
> >>>> org.apache.nifi.NiFi An Unknown Error Occurred in Thread Thread[Flow
> >>>> Service Tasks Thread-2,5,main]: java.lang.OutOfMemoryError: Java heap space
> >>>>
> >>>> I am very new to NiFi and trying to get a few of the use cases going.
> >>>> I need help from the community.
> >>>>
> >>>> Thanks again
> >>>>
> >>>> Rai
> >>>>
> >>>>
> >>>>
> >>>
>

Re: nifi is running out of memory

Posted by Adam Lamar <ad...@gmail.com>.
Hey All,

I believe OP is running into a bug fixed here:
https://issues.apache.org/jira/browse/NIFI-2631

Basically, ListS3 attempts to commit all the files it finds
(potentially 100k+) at once, rather than in batches. NIFI-2631
addresses the issue. Looks like the fix is out in 0.7.1 but not yet in
a 1.x release.
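
For what it's worth, the shape of that fix is to commit the session every
so many flowfiles instead of once after the whole listing. A rough sketch
of the idea in NiFi processor terms -- the helper name and the batch size
of 1000 are illustrative assumptions, not the actual NIFI-2631 patch:

    import com.amazonaws.services.s3.model.ObjectListing;
    import com.amazonaws.services.s3.model.S3ObjectSummary;
    import org.apache.nifi.flowfile.FlowFile;
    import org.apache.nifi.processor.ProcessSession;

    // Hypothetical sketch: emit one flowfile per listed key, committing in
    // batches so a 100k+ object listing never sits in the heap all at once.
    void emitListing(final ProcessSession session, final ObjectListing listing) {
        int uncommitted = 0;
        for (final S3ObjectSummary summary : listing.getObjectSummaries()) {
            FlowFile flowFile = session.create();
            flowFile = session.putAttribute(flowFile, "filename", summary.getKey());
            session.transfer(flowFile, REL_SUCCESS);  // REL_SUCCESS as defined by the processor
            if (++uncommitted >= 1000) {
                session.commit();  // hand off this batch instead of buffering everything
                uncommitted = 0;
            }
        }
        session.commit();  // flush the final partial batch
    }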

Cheers,
Adam


On Thu, Oct 27, 2016 at 7:59 PM, Joe Witt <jo...@gmail.com> wrote:
> Looking at this line [1] makes me think the FetchS3 processor is
> properly streaming the bytes directly to the content repository.
>
> Looking at the screenshot showing nothing out of the ListS3 processor
> makes me think the bucket has so many things in it that the processor
> or associated library isn't handling that well and is just listing
> everything with no mechanism to cap the buffer size.  Krish, please try
> with the largest heap you can and let us know what you see.
>
> [1] https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/main/java/org/apache/nifi/processors/aws/s3/FetchS3Object.java#L107
>
> On Thu, Oct 27, 2016 at 9:37 PM, Joe Witt <jo...@gmail.com> wrote:
>> moving dev to bcc
>>
>> Yes, I believe the issue here is that FetchS3 doesn't do chunked
>> transfers and so is loading it all into memory.  I've not verified this
>> in the code yet, but it seems quite likely.  Krish, if you can verify
>> that going with a larger heap gets you in the game, can you please file
>> a JIRA?
>>
>> Thanks
>> Joe
>>
>> On Thu, Oct 27, 2016 at 9:34 PM, Bryan Bende <bb...@gmail.com> wrote:
>>> Hello,
>>>
>>> Are you running with all of the default settings?
>>>
>>> If so, you would probably want to try increasing the memory settings in
>>> conf/bootstrap.conf.
>>>
>>> They default to 512 MB; you may want to try bumping it up to 1024 MB.
>>>
>>> -Bryan
>>>
>>> On Thu, Oct 27, 2016 at 5:46 PM, Gop Krr <go...@gmail.com> wrote:
>>>>
>>>> Hi All,
>>>>
>>>> I have a very simple data flow, where I need to move S3 data from one bucket
>>>> in one account to another bucket under another account. I have attached my
>>>> processor configuration.
>>>>
>>>>
>>>> 2016-10-27 20:09:57,626 ERROR [Flow Service Tasks Thread-2]
>>>> org.apache.nifi.NiFi An Unknown Error Occurred in Thread Thread[Flow Service
>>>> Tasks Thread-2,5,main]: java.lang.OutOfMemoryError: Java heap space
>>>>
>>>> I am very new to NiFi and trying to get a few of the use cases going. I need
>>>> help from the community.
>>>>
>>>> Thanks again
>>>>
>>>> Rai
>>>>
>>>>
>>>>
>>>

Re: nifi is running out of memory

Posted by Joe Witt <jo...@gmail.com>.
Looking at this line [1] makes me think the FetchS3 processor is
properly streaming the bytes directly to the content repository.

Looking at the screenshot showing nothing out of the ListS3 processor
makes me think the bucket has so many things in it that the processor
or associated library isn't handling that well and is just listing
everything with no mechanism to cap the buffer size.  Krish, please try
with the largest heap you can and let us know what you see.

[1] https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/main/java/org/apache/nifi/processors/aws/s3/FetchS3Object.java#L107
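
For anyone reading along, the pattern at that line boils down to the sketch
below. This is a paraphrase rather than the actual processor code, and the
method name and parameters are mine; the point is that the S3 object body
is an InputStream that session.importFrom() copies straight into the
content repository, so the payload never has to fit in the heap:

    import java.io.IOException;
    import java.io.InputStream;
    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.model.GetObjectRequest;
    import com.amazonaws.services.s3.model.S3Object;
    import org.apache.nifi.flowfile.FlowFile;
    import org.apache.nifi.processor.ProcessSession;

    // Hypothetical sketch of a streaming fetch: the whole object is never
    // buffered in memory, only streamed through a bounded copy buffer.
    FlowFile fetch(final AmazonS3 client, final ProcessSession session,
                   FlowFile flowFile, final String bucket, final String key)
            throws IOException {
        final S3Object s3Object = client.getObject(new GetObjectRequest(bucket, key));
        try (final InputStream in = s3Object.getObjectContent()) {
            // importFrom() streams the body into NiFi's content repository.
            flowFile = session.importFrom(in, flowFile);
        }
        return flowFile;
    }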

On Thu, Oct 27, 2016 at 9:37 PM, Joe Witt <jo...@gmail.com> wrote:
> moving dev to bcc
>
> Yes, I believe the issue here is that FetchS3 doesn't do chunked
> transfers and so is loading it all into memory.  I've not verified this
> in the code yet, but it seems quite likely.  Krish, if you can verify
> that going with a larger heap gets you in the game, can you please file
> a JIRA?
>
> Thanks
> Joe
>
> On Thu, Oct 27, 2016 at 9:34 PM, Bryan Bende <bb...@gmail.com> wrote:
>> Hello,
>>
>> Are you running with all of the default settings?
>>
>> If so, you would probably want to try increasing the memory settings in
>> conf/bootstrap.conf.
>>
>> They default to 512 MB; you may want to try bumping it up to 1024 MB.
>>
>> -Bryan
>>
>> On Thu, Oct 27, 2016 at 5:46 PM, Gop Krr <go...@gmail.com> wrote:
>>>
>>> Hi All,
>>>
>>> I have a very simple data flow, where I need to move S3 data from one bucket
>>> in one account to another bucket under another account. I have attached my
>>> processor configuration.
>>>
>>>
>>> 2016-10-27 20:09:57,626 ERROR [Flow Service Tasks Thread-2]
>>> org.apache.nifi.NiFi An Unknown Error Occurred in Thread Thread[Flow Service
>>> Tasks Thread-2,5,main]: java.lang.OutOfMemoryError: Java heap space
>>>
>>> I am very new to NiFi and trying to get a few of the use cases going. I need
>>> help from the community.
>>>
>>> Thanks again
>>>
>>> Rai
>>>
>>>
>>>
>>

Re: nifi is running out of memory

Posted by Joe Witt <jo...@gmail.com>.
moving dev to bcc

Yes, I believe the issue here is that FetchS3 doesn't do chunked
transfers and so is loading it all into memory.  I've not verified this
in the code yet, but it seems quite likely.  Krish, if you can verify
that going with a larger heap gets you in the game, can you please file
a JIRA?

Thanks
Joe

On Thu, Oct 27, 2016 at 9:34 PM, Bryan Bende <bb...@gmail.com> wrote:
> Hello,
>
> Are you running with all of the default settings?
>
> If so, you would probably want to try increasing the memory settings in
> conf/bootstrap.conf.
>
> They default to 512 MB; you may want to try bumping it up to 1024 MB.
>
> -Bryan
>
> On Thu, Oct 27, 2016 at 5:46 PM, Gop Krr <go...@gmail.com> wrote:
>>
>> Hi All,
>>
>> I have a very simple data flow, where I need to move S3 data from one bucket
>> in one account to another bucket under another account. I have attached my
>> processor configuration.
>>
>>
>> 2016-10-27 20:09:57,626 ERROR [Flow Service Tasks Thread-2]
>> org.apache.nifi.NiFi An Unknown Error Occurred in Thread Thread[Flow Service
>> Tasks Thread-2,5,main]: java.lang.OutOfMemoryError: Java heap space
>>
>> I am very new to NiFi and trying to get a few of the use cases going. I need
>> help from the community.
>>
>> Thanks again
>>
>> Rai
>>
>>
>>
>

Re: nifi is running out of memory

Posted by Bryan Bende <bb...@gmail.com>.
Hello,

Are you running with all of the default settings?

If so, you would probably want to try increasing the memory settings in
conf/bootstrap.conf.

They default to 512 MB; you may want to try bumping it up to 1024 MB.
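
In a stock install, the heap arguments in conf/bootstrap.conf look roughly
like the lines below (the exact java.arg numbers can vary between versions,
so adjust whatever is already there rather than pasting these in blindly):

    # conf/bootstrap.conf -- raise the initial and maximum heap together
    java.arg.2=-Xms1024m
    java.arg.3=-Xmx1024m

A restart is needed for the new heap size to take effect.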

-Bryan

On Thu, Oct 27, 2016 at 5:46 PM, Gop Krr <go...@gmail.com> wrote:

> Hi All,
>
> I have a very simple data flow, where I need to move S3 data from one bucket
> in one account to another bucket under another account. I have attached my
> processor configuration.
>
>
> 2016-10-27 20:09:57,626 ERROR [Flow Service Tasks Thread-2]
> org.apache.nifi.NiFi An Unknown Error Occurred in Thread Thread[Flow
> Service Tasks Thread-2,5,main]: java.lang.OutOfMemoryError: Java heap
> space
>
> I am very new to NiFi and trying to get a few of the use cases going. I need
> help from the community.
>
> Thanks again
>
> Rai
>
>
>
>
