Posted to users@nifi.apache.org by Selvam Raman <se...@gmail.com> on 2016/09/20 18:54:18 UTC

Re: PutS3 object returns jvm out of memory or disk out of memory

I have 500+ HTTP requests, each of which returns a file of varying size
that is then stored in S3.

For each HTTP (OAI-PMH) request we get one file to put into S3.

So the content repository keeps growing with the size of the files. At
some point it reaches 4.6 GB, which is all the available disk space on my
machine.

I do not know why the content repository keeps files even though I set
content archiving to false.

I also don't know how to limit the content repository size. If I am going
to put 1 TB of data into S3, do I need 1 TB of content repository space?
I am clueless.

Thanks,
Selvam R
+91-97877-87724
On Sep 20, 2016 5:22 PM, "Aldrin Piri" <al...@gmail.com> wrote:

> Hi Selvam,
>
> As mentioned, please keep messages to the one list. Moving dev to bcc
> again.
>
> Archiving only applies to content that has exited the flow and is not
> referenced by any FlowFiles currently in your processing graph, similar
> to garbage collection in Java.  For this particular instance, unless
> there is content already on disk, disabling it would likely not provide a
> remedy.
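>
> For reference, archiving is governed by these entries in nifi.properties
> (the retention values shown are the usual defaults, included only as an
> illustration):
>
>   nifi.content.repository.archive.enabled=false
>   # consulted only when archiving is enabled:
>   nifi.content.repository.archive.max.retention.period=12 hours
>   nifi.content.repository.archive.max.usage.percentage=50%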
>
> The image did not show for me in my mail client, but I was able to locate
> it in a list archive:  http://apache-nifi.1125220.n5.nabble.com/attachment/
> 12226/0/image.png
>
> That image shows InvokeHTTP reporting an error.  Could you clarify whether
> this is happening just on that processor or also on the previously
> mentioned PutS3?
>
> Could you possibly provide a template of your flow for inspection or
> provide more details about what it is doing?  Are there connections with
> large queues?  Does a "df -h" show that your instance partition is
> exhausted?
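>
> As a rough way to check from the NiFi install directory (the repository
> paths below assume the default layout; adjust them to match your
> nifi.properties):
>
>   df -h                          # overall partition usage
>   du -sh ./content_repository    # size of the content repository
>   du -sh ./flowfile_repository ./provenance_repository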
>
> NiFi will continuously bring data into the system and, depending on what
> you are doing, will continue until disk space is exhausted, which seems to
> be the issue at hand.  NiFi provides facilities to help avoid situations
> such as these, including backpressure and FlowFile expiration.  Upon
> introducing content into a flow, NiFi holds onto it until it finishes its
> path through the flow or is expunged via expiration, making it eligible
> for removal and/or archival in the backing content repository.
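>
> Backpressure and FlowFile expiration are set per connection in its
> configuration dialog; in the underlying flow.xml a connection's settings
> look roughly like this (the values are the usual defaults, shown only as
> an illustration):
>
>   <backPressureObjectThreshold>10000</backPressureObjectThreshold>
>   <backPressureDataSizeThreshold>1 GB</backPressureDataSizeThreshold>
>   <flowFileExpiration>0 sec</flowFileExpiration>
>
> An expiration of "0 sec" means FlowFiles never expire; a non-zero value
> lets NiFi drop data that has sat in a queue longer than that, freeing the
> content repository.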
>
> Thanks!
>
> On Tue, Sep 20, 2016 at 12:05 PM, Selvam Raman <se...@gmail.com> wrote:
>
>> In my case it is running out of disk space.
>>
>> I set nifi.content.repository.archive.enabled=false (and restarted the
>> NiFi cluster after changing this).
>>
>> But I can still see the processor keep writing to the disk.
>>
>> On Tue, Sep 20, 2016 at 4:34 PM, Joe Witt <jo...@gmail.com> wrote:
>>
>> > Hello
>> >
>> > Please only post to one list.  I have moved 'dev@nifi' to bcc.
>> >
>> > In the docs for this processor [1] you'll find a reference to "Multipart
>> > Part Size".  Set that to a smaller value appropriate for your JVM
>> > memory settings.  For instance, if you have a default JVM heap size of
>> > 512 MB you'll want something far smaller, like 50 MB.  At least I
>> > suspect this is the issue.
>> >
>> > [1] https://nifi.apache.org/docs/nifi-docs/components/org.
>> > apache.nifi.processors.aws.s3.PutS3Object/index.html
>> >
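>> > As a sketch, on the PutS3Object Properties tab that might look like the
>> > following (the values are illustrative, not a recommendation for every
>> > flow):
>> >
>> >   Multipart Threshold : 50 MB
>> >   Multipart Part Size : 50 MB
>> >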
>> > On Tue, Sep 20, 2016 at 11:30 AM, Selvam Raman <se...@gmail.com>
>> wrote:
>> > > Hi,
>> > >
>> > > I am pushing data to S3 using PutS3Object. I have set up a NiFi 1.0
>> > > zero-master cluster.
>> > >
>> > > The EC2 instance has only 8 GB of disk. The content repository writes
>> > > up to 4.6 GB of data and then throws a JVM out-of-memory error.
>> > >
>> > > I changed nifi.content.archive to false in nifi.properties, but it
>> > > still keeps on writing.
>> > >
>> > > Please help me.
>> > >
>> > > --
>> > > Selvam Raman
>> > > "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>> >
>>
>>
>>
>> --
>> Selvam Raman
>> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>>
>
>