Posted to dev@nifi.apache.org by "dale.chang13" <da...@outlook.com> on 2016/04/29 15:12:13 UTC

FetchFile Cannot Allocate Enough Memory

I have been trying to run my data flow and have been running into a problem
where FetchFile is unable to read files. I will detail my process below, and I
would like some confirmation of my suspicions.

First I am ingesting an initial file that is fairly large, which contains
the path/filename of a ton of text files within another directory. The goal
is to read in the content of that large file, then read in the contents of
the thousands of text files, and then store the text file content into Solr.

The problem I am having is that the second FetchFile, the one that reads in
the smaller text files, frequently reports an error: "FileNotFoundException
xxx.txt (Cannot allocate memory); routing to failure". This FetchFile runs
successfully for about 20,000 files and then reports the above error
continuously for the rest of the files.

My suspicion comes down to two possibilities: not enough heap space, or not
enough content_repo/flowfile_repo space. Any ideas or questions?



--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/FetchFile-Cannot-Allocate-Enough-Memory-tp9720.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.

Re: FetchFile Cannot Allocate Enough Memory

Posted by "dale.chang13" <da...@outlook.com>.
Mark Payne wrote
> Dale,
> 
> I think an image of the flow would be useful. Or better yet, if you can, a
> template of the flow, so
> that we can see all of the configuration being used.
> 
> When you said you "get stuck at around 20 MB and then NiFi moves to a
> crawl" I'm not clear on
> what you are saying exactly. After you process 20 MB of the 189 MB CSV
> file? After you ingest
> 20 MB worth of files via the second FetchFile?
> 
> Also, which directory has 85,000 files? The first directory being polled
> via ListFile, or the directory
> that you are picking up from via the second FetchFile?
> 
> Thanks
> -Mark

Attached below you should find the template of the flow in question. I had to
remove or alter some information, but that does not impact the workflow.

We have to ingest a CSV (called a DAT file) with variable delimiters and
qualifiers (so it's not really a CSV). The CSV has headers and many lines.
Each line corresponds to one text document. Each line also contains metadata
about and a URI to that document.
There are several folders which contain the text files that are described by
the big CSV file. Two FetchFiles later in the flow will attempt to find and
read the text documents corresponding to those URIs.

Here's a description of the directory structure:
*/dat* contains the gigantic CSV 
*/directory1* contains thousands of text documents that are described by the
CSV
*/directory2* contains additional documents described by the CSV
*/directory3*... and so on

Here are the steps to my flow:

1) The first ListFile-FetchFile pair is in the first Process Group, named
"Find and Read DATFile". It reads the DAT file (the CSV), which contains
hundreds of thousands of lines.

2) The Split DATFile Process Group chunks the CSV file into individual
FlowFiles.

3) In the Clean/Extract Metadata Process Group, we use regular expressions
via ExtractText to write the metadata to FlowFile attributes, then use
AttributesToJSON, and finally store those JSON documents in Solr. The
Processors in this group use regular expressions to clean and validate the
JSON document that is generated later.

4) The Read Extracted Text Process Group contains the second FetchFile, which
reads in files according to the URIs listed in the CSV (see the plain-Java
sketch after this list). This is where the read/write speed dips ("NiFi moves
to a crawl") once 20-30 MB of text files have been read through the second
FetchFile.

5) The Store in Solr Process Group batches up JSON documents and stores them
to SolrCloud.
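
To make the extract-then-fetch pattern in steps 3 and 4 concrete, here is a
rough plain-Java analogue of what those processors do. This is only a sketch:
the delimiter, field names, custodian value, and regex below are hypothetical
(the real DAT file uses different delimiters and qualifiers), and the path is
taken from the error log later in this thread.

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class LineToDocumentSketch {
        // Hypothetical DAT line layout: document id, custodian, then a URI,
        // pipe-delimited. The real file uses different delimiters/qualifiers.
        private static final Pattern LINE =
                Pattern.compile("^(?<docId>[^|]+)\\|(?<custodian>[^|]+)\\|(?<uri>.+)$");

        public static void main(String[] args) throws IOException {
            String line = "ENR-00188512|Doe, Jane|"
                    + "/tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188512.txt";

            // Step 3 analogue: ExtractText pulls fields out of the line and
            // writes them to FlowFile attributes.
            Matcher m = LINE.matcher(line);
            if (!m.matches()) {
                throw new IllegalArgumentException("line did not match the expected layout");
            }
            Map<String, String> attributes = new HashMap<>();
            attributes.put("doc.id", m.group("docId"));
            attributes.put("doc.custodian", m.group("custodian"));
            attributes.put("doc.uri", m.group("uri"));

            // Step 4 analogue: FetchFile reads the file named by the extracted attribute.
            byte[] text = Files.readAllBytes(Paths.get(attributes.get("doc.uri")));
            System.out.println("fetched " + text.length + " bytes for " + attributes.get("doc.id"));
        }
    }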

Document_Ingestion_Redacted.xml
<http://apache-nifi-developer-list.39713.n7.nabble.com/file/n9916/Document_Ingestion_Redacted.xml>  



--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/FetchFile-Cannot-Allocate-Enough-Memory-tp9720p9916.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.

Re: FetchFile Cannot Allocate Enough Memory

Posted by Mark Payne <ma...@hotmail.com>.
Dale,

I think an image of the flow would be useful. Or better yet, if you can, a template of the flow, so
that we can see all of the configuration being used.

When you said you "get stuck at around 20 MB and then NiFi moves to a crawl" I'm not clear on
what you are saying exactly. After you process 20 MB of the 189 MB CSV file? After you ingest
20 MB worth of files via the second FetchFile?

Also, which directory has 85,000 files? The first directory being polled via ListFile, or the directory
that you are picking up from via the second FetchFile?

Thanks
-Mark


> On May 4, 2016, at 9:01 AM, dale.chang13 <da...@outlook.com> wrote:
> 
> Joe Witt wrote
>> On May 4, 2016, at 8:56 AM, Joe Witt <joe.witt@...> wrote:
>> 
>> Dale,
>> 
>> Where there is a fetch file there is usually a list file. And while
>> the symptom of memory issues is showing up in fetch file I am curious
>> if the issue might actually be caused in ListFile. How many files are
>> in the directory being listed?
>> 
>> Mark,
>> 
>> Are we using a stream-friendly API to list files, and do we know if
>> that API is really doing things in a stream-friendly way on all
>> platforms?
>> 
>> Thanks
>> Joe
> 
> So I will explain my flow first and then I will answer your question of how
> I am using ListFile and FetchFile.
> 
> To begin my process, I am ingesting a CSV file that contains a list of
> filenames. The first (and only) ListFile starts off the flow and passes the
> listing to the first FetchFile to retrieve the contents of the CSV. Afterward,
> I use ExtractText to pull out all of the file names and put them as attributes
> on individual FlowFiles. Then I use a second FetchFile (this is the processor
> that has trouble allocating memory), which uses expression language with that
> file name attribute to retrieve the corresponding text document.
> 
> The CSV file (189 MB) contains metadata and path/filenames for over 200,000
> documents, and I am having trouble reading from a directory of about 85,000
> documents (second FetchFile, each document is usually a few KB). I get stuck
> at around 20 MB and then NiFi moves to a crawl.
> 
> I can give you a picture of our actual flow if you need it
> 
> 
> Mark Payne wrote
>> ListFile performs a listing using Java's File.listFiles(). This will
>> provide a list of all files in the
>> directory. I do not believe this to be related, though. Googling indicates
>> that when this error
>> occurs it is related to the ability to create a native process in order to
>> interact with the file system.
>> I don't think the issue is related to Java heap but rather available RAM
>> on the box. How much RAM
>> is actually available on the box? You mentioned IOPS - are you running in
>> a virtual cloud environment?
>> Using remote storage such as Amazon EBS?
> 
> I am running six Linux VMs on a Windows 8 machine. Three VMs (one ncm, two
> nodes) use NiFi and those VMs have 20 GB assigned to them. Looking through
> Ambari and monitoring the memory on the nodes, I have a little more than 4
> GB free RAM on the nodes. It looks like the free memory dipped severely
> during my NiFi flow, but no swap memory was used.
> 
> 
> 
> --
> View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/FetchFile-Cannot-Allocate-Enough-Memory-tp9720p9911.html
> Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.


Re: FetchFile Cannot Allocate Enough Memory

Posted by "dale.chang13" <da...@outlook.com>.
Joe Witt wrote
> On May 4, 2016, at 8:56 AM, Joe Witt <joe.witt@...> wrote:
> 
> Dale,
> 
> Where there is a fetch file there is usually a list file. And while
> the symptom of memory issues is showing up in fetch file I am curious
> if the issue might actually be caused in ListFile. How many files are
> in the directory being listed?
> 
> Mark,
> 
> Are we using a stream-friendly API to list files, and do we know if
> that API is really doing things in a stream-friendly way on all
> platforms?
> 
> Thanks
> Joe

So I will explain my flow first and then I will answer your question of how
I am using ListFile and FetchFile.

To begin my process, I am ingesting a CSV file that contains a list of
filenames. The first (and only) ListFile starts off the flow and passes the
listing to the first FetchFile to retrieve the contents of the CSV. Afterward,
I use ExtractText to pull out all of the file names and put them as attributes
on individual FlowFiles. Then I use a second FetchFile (this is the processor
that has trouble allocating memory), which uses expression language with that
file name attribute to retrieve the corresponding text document.

The CSV file (189 MB) contains metadata and path/filenames for over 200,000
documents, and I am having trouble reading from a directory of about 85,000
documents (second FetchFile, each document is usually a few KB). I get stuck
at around 20 MB and then NiFi moves to a crawl.

I can give you a picture of our actual flow if you need it


Mark Payne wrote
> ListFile performs a listing using Java's File.listFiles(). This will
> provide a list of all files in the
> directory. I do not believe this to be related, though. Googling indicates
> that when this error
> occurs it is related to the ability to create a native process in order to
> interact with the file system.
> I don't think the issue is related to Java heap but rather available RAM
> on the box. How much RAM
> is actually available on the box? You mentioned IOPS - are you running in
> a virtual cloud environment?
> Using remote storage such as Amazon EBS?

I am running six Linux VMs on a Windows 8 machine. Three VMs (one ncm, two
nodes) use NiFi and those VMs have 20 GB assigned to them. Looking through
Ambari and monitoring the memory on the nodes, I have a little more than 4
GB free RAM on the nodes. It looks like the free memory dipped severely
during my NiFi flow, but no swap memory was used.



--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/FetchFile-Cannot-Allocate-Enough-Memory-tp9720p9911.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.

Re: FetchFile Cannot Allocate Enough Memory

Posted by Mark Payne <ma...@hotmail.com>.
ListFile performs a listing using Java's File.listFiles(). This will provide a list of all files in the
directory. I do not believe this to be related, though. Googling indicates that when this error
occurs it is related to the ability to create a native process in order to interact with the file system.
I don't think the issue is related to Java heap but rather available RAM on the box. How much RAM
is actually available on the box? You mentioned IOPS - are you running in a virtual cloud environment?
Using remote storage such as Amazon EBS?
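
As a point of comparison on the listing question, below is a minimal sketch
contrasting the array-based File.listFiles() call with a stream-friendly
alternative from java.nio.file. The directory path is just an example, and
this is not a claim about how ListFile is implemented beyond the listFiles()
call mentioned above.

    import java.io.File;
    import java.io.IOException;
    import java.nio.file.DirectoryStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class ListingComparison {
        public static void main(String[] args) throws IOException {
            Path dir = Paths.get("/tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001");

            // Non-streaming: materializes every entry into one array up front,
            // so a directory with tens of thousands of files is held in memory at once.
            File[] all = dir.toFile().listFiles();
            System.out.println("listFiles() returned " + (all == null ? 0 : all.length) + " entries");

            // Streaming: entries are read lazily as the iterator advances.
            try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
                for (Path p : stream) {
                    System.out.println(p.getFileName());
                }
            }
        }
    }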

> On May 4, 2016, at 8:56 AM, Joe Witt <jo...@gmail.com> wrote:
> 
> Dale,
> 
> Where there is a fetch file there is usually a list file. And while
> the symptom of memory issues is showing up in fetch file I am curious
> if the issue might actually be caused in ListFile. How many files are
> in the directory being listed?
> 
> Mark,
> 
> Are we using a stream-friendly API to list files, and do we know if
> that API is really doing things in a stream-friendly way on all
> platforms?
> 
> Thanks
> Joe
> 
> On Wed, May 4, 2016 at 7:37 AM, dale.chang13 <da...@outlook.com> wrote:
>> So I still haven't gotten to the bottom of this problem, and I am assuming
>> that this is an IOPS problem rather than a RAM issue.
>> 
>> I have monitored the memory of the nodes in my cluster during the flow,
>> before and after the "cannot allocate memory" exception occurs. There is no
>> memory leak: according to jconsole, the memory used by the JVM holds steady
>> between 50 and 100 MB. As a note, I have allocated a minimum of 1 GB and a
>> maximum of 4 GB of heap for each node.
>> 
>> There are also no changes to the number of active threads (35) in jconsole
>> while the NiFi gui shows up to 20 active threads. Additionally the number of
>> classes loaded and CPU usage remains the same throughout the whole NiFi
>> operation.
>> 
>> The only difference I have seen is disk activity on the drive that NiFi is
>> configured to read from and write to.
>> 
>> My question is: does it make sense that this is an IO issue, or a RAM/memory
>> issue?
>> 
>> 
>> 
>> --
>> View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/FetchFile-Cannot-Allocate-Enough-Memory-tp9720p9901.html
>> Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.


Re: FetchFile Cannot Allocate Enough Memory

Posted by Joe Witt <jo...@gmail.com>.
Dale,

Where there is a fetch file there is usually a list file. And while
the symptom of memory issues is showing up in fetch file I am curious
if the issue might actually be caused in ListFile. How many files are
in the directory being listed?

Mark,

Are we using a stream-friendly API to list files, and do we know if
that API is really doing things in a stream-friendly way on all
platforms?

Thanks
Joe

On Wed, May 4, 2016 at 7:37 AM, dale.chang13 <da...@outlook.com> wrote:
> So I still haven't gotten to the bottom of this problem, and I am assuming
> that this is an IOPS problem rather than a RAM issue.
>
> I have monitored the memory of the nodes in my cluster during the flow,
> before and after the "cannot allocate memory" exception occurs. There is no
> memory leak: according to jconsole, the memory used by the JVM holds steady
> between 50 and 100 MB. As a note, I have allocated a minimum of 1 GB and a
> maximum of 4 GB of heap for each node.
>
> There are also no changes to the number of active threads (35) in jconsole
> while the NiFi gui shows up to 20 active threads. Additionally the number of
> classes loaded and CPU usage remains the same throughout the whole NiFi
> operation.
>
> The only difference I have seen is disk activity on the drive that NiFi is
> configured to read from and write to.
>
> My question is: does it make sense that this is an IO issue, or a RAM/memory
> issue?
>
>
>
> --
> View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/FetchFile-Cannot-Allocate-Enough-Memory-tp9720p9901.html
> Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.

Re: FetchFile Cannot Allocate Enough Memory

Posted by "dale.chang13" <da...@outlook.com>.
So I still haven't gotten to the bottom of this problem, and I am assuming
that this is an IOPS problem rather than a RAM issue.

I have monitored the memory of the nodes in my cluster during the flow,
before and after the "cannot allocate memory" exception occurs. There is no
memory leak: according to jconsole, the memory used by the JVM holds steady
between 50 and 100 MB. As a note, I have allocated a minimum of 1 GB and a
maximum of 4 GB of heap for each node.

There are also no changes to the number of active threads (35) in jconsole,
while the NiFi GUI shows up to 20 active threads. Additionally, the number of
classes loaded and the CPU usage remain the same throughout the whole NiFi
operation.

The only difference I have seen is disk activity on the drive that NiFi is
configured to read from and write to.

My question is: does it make sense that this is an IO issue, or a RAM/memory
issue?



--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/FetchFile-Cannot-Allocate-Enough-Memory-tp9720p9901.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.

Re: FetchFile Cannot Allocate Enough Memory

Posted by "dale.chang13" <da...@outlook.com>.
Mark Payne wrote
> Some googling of "FileNotFoundException cannot allocate memory" indicates
> that this is
> fairly common when running in a VM that has very little RAM, as there is
> not enough heap
> space even to create a linux process. Do you have a reasonable amount of
> RAM free on the
> box?
> 
> Thanks
> -Mark

Each node is configured for 20 GB of memory, and the bootstrap.conf for each
node specifies the JVM heap size to be at a minimum of 8g and a max of 10g:
-Xms8g
-Xmx10g
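
For reference, in a stock NiFi bootstrap.conf those heap flags are normally
set through numbered java.arg entries rather than written directly; the index
numbers below are only illustrative:

    java.arg.2=-Xms8g
    java.arg.3=-Xmx10g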

Checking my Hyper-V Manager, the activity of the boxes is fairly low--less
than 10% of the 20 GB I gave each of them.

I still have the same stack trace pop up.



--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/FetchFile-Cannot-Allocate-Enough-Memory-tp9720p9728.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.

Re: FetchFile Cannot Allocate Enough Memory

Posted by Mark Payne <ma...@hotmail.com>.
Some googling of "FileNotFoundException cannot allocate memory" indicates that this is
fairly common when running in a VM that has very little RAM, as there is not enough heap
space even to create a linux process. Do you have a reasonable amount of RAM free on the
box?

Thanks
-Mark
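
For what it's worth, here is a minimal sketch of how that failure surfaces at
the Java level: on Linux, any failure of the underlying open call, including
ENOMEM, is reported by FileInputStream as a FileNotFoundException with the OS
error text in parentheses. The path is taken from the log quoted below; this
only illustrates the shape of the exception, not FetchFile's internals.

    import java.io.FileInputStream;
    import java.io.FileNotFoundException;
    import java.io.IOException;

    public class OpenFailureDemo {
        public static void main(String[] args) {
            String path = "/tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188512.txt";
            try (FileInputStream in = new FileInputStream(path)) {
                System.out.println("opened " + path + ", first byte: " + in.read());
            } catch (FileNotFoundException e) {
                // Thrown not only when the file is missing but for other open(2)
                // failures as well; the message ends with the OS error text,
                // e.g. "(Cannot allocate memory)" or "(Too many open files)".
                System.err.println("open failed: " + e.getMessage());
            } catch (IOException e) {
                System.err.println("I/O error: " + e.getMessage());
            }
        }
    }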

> On Apr 29, 2016, at 9:29 AM, dale.chang13 <da...@outlook.com> wrote:
> 
> Mark Payne wrote
>> Dale,
>> 
>> I haven't seen this issue personally. I don't believe it has to do with
>> content/flowfile
>> repo space. Can you check the logs/nifi-app.log file and give us the exact
>> error message
>> from the logs, with the stack trace if it is provided?
>> 
>> Thanks
>> -Mark
> 
> Sure, this is from one of the slave nodes. It hardly provides any
> information. I suppose I could do a jstat or a df -h.
> 
> I've also created a MonitorMemory Reporting Task, but I cannot seem to
> provide the correct names for the memory pool. The only one that works is the
> G1 Old Gen memory pool.
> 
> app.log wrote
>> 2016-04-29 10:16:28,027 ERROR [Timer-Driven Process Thread-6]
>> o.a.nifi.processors.standard.FetchFile
>> FetchFile[id=6c7482f2-5780-37c8-99a0-f2d87cbcbba9] Could not fetch file
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188512.txt from
>> file system for
>> StandardFlowFileRecord[uuid=1a2d7918-377e-4256-8610-1b12493eb16e,claim=StandardContentClaim
>> [resourceClaim=StandardResourceClaim[id=1461938937407-463,
>> container=default, section=463], offset=47518,
>> length=1268],offset=0,name=FW: FERC Daily News.msg,size=1268] due to
>> java.io.FileNotFoundException:
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188512.txt
>> (Cannot allocate memory); routing to failure:
>> java.io.FileNotFoundException:
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188512.txt
>> (Cannot allocate memory)
>> 
>> 2016-04-29 10:16:28,028 ERROR [Timer-Driven Process Thread-6]
>> o.a.nifi.processors.standard.FetchFile
>> FetchFile[id=6c7482f2-5780-37c8-99a0-f2d87cbcbba9] Could not fetch file
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188518.txt from
>> file system for
>> StandardFlowFileRecord[uuid=3b5fef42-2ded-47cc-aba2-6caf95f04977,claim=StandardContentClaim
>> [resourceClaim=StandardResourceClaim[id=1461938937407-463,
>> container=default, section=463], offset=48786,
>> length=1640],offset=0,name=FW: FERC Docket No. EL01-47:  Removing
>> Obstacles To Increased Eleu0020ctri c Generation And Natural Gas Supply In
>> The Western United States.msg,size=1640] due to
>> java.io.FileNotFoundException:
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188518.txt
>> (Cannot allocate memory); routing to failure:
>> java.io.FileNotFoundException:
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188518.txt
>> (Cannot allocate memory)
>> 
>> 2016-04-29 10:16:28,029 ERROR [Timer-Driven Process Thread-8]
>> o.a.nifi.processors.standard.FetchFile
>> FetchFile[id=6c7482f2-5780-37c8-99a0-f2d87cbcbba9] Could not fetch file
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188526.txt from
>> file system for
>> StandardFlowFileRecord[uuid=71715448-2acd-4f5c-af57-9209461fe62e,claim=StandardContentClaim
>> [resourceClaim=StandardResourceClaim[id=1461938937407-463,
>> container=default, section=463], offset=50426,
>> length=1272],offset=0,name=FW: workshop notice.msg,size=1272] due to
>> java.io.FileNotFoundException:
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188526.txt
>> (Cannot allocate memory); routing to failure:
>> java.io.FileNotFoundException:
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188526.txt
>> (Cannot allocate memory)
>> 
>> 2016-04-29 10:16:28,030 ERROR [Timer-Driven Process Thread-9]
>> o.a.nifi.processors.standard.FetchFile
>> FetchFile[id=6c7482f2-5780-37c8-99a0-f2d87cbcbba9] Could not fetch file
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188534.txt from
>> file system for
>> StandardFlowFileRecord[uuid=0bf8666b-a9bc-4412-905e-6e4a2b13253d,claim=StandardContentClaim
>> [resourceClaim=StandardResourceClaim[id=1461938937407-463,
>> container=default, section=463], offset=51698,
>> length=1440],offset=0,name=FW: Final Report on Workshop Report to Discuss
>> Alternative Gas Inu0020dice s.msg,size=1440] due to
>> java.io.FileNotFoundException:
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188534.txt
>> (Cannot allocate memory); routing to failure:
>> java.io.FileNotFoundException:
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188534.txt
>> (Cannot allocate memory)
>> 
>> 2016-04-29 10:16:28,034 ERROR [Timer-Driven Process Thread-3]
>> o.a.nifi.processors.standard.FetchFile
>> FetchFile[id=6c7482f2-5780-37c8-99a0-f2d87cbcbba9] Could not fetch file
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188558.txt from
>> file system for
>> StandardFlowFileRecord[uuid=805d4127-d86b-4b4b-a7a0-9f6f300dc13e,claim=StandardContentClaim
>> [resourceClaim=StandardResourceClaim[id=1461938937407-463,
>> container=default, section=463], offset=54433,
>> length=1334],offset=0,name=FW: Proposed NARUC resolution on
>> hedging.msg,size=1334] due to java.io.FileNotFoundException:
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188558.txt
>> (Cannot allocate memory); routing to failure:
>> java.io.FileNotFoundException:
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188558.txt
>> (Cannot allocate memory)
>> 
>> 2016-04-29 10:16:28,035 ERROR [Timer-Driven Process Thread-3]
>> o.a.nifi.processors.standard.FetchFile
>> FetchFile[id=6c7482f2-5780-37c8-99a0-f2d87cbcbba9] Could not fetch file
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188570.txt from
>> file system for
>> StandardFlowFileRecord[uuid=38ee3345-fa72-459c-92f0-7fc2d58160ba,claim=StandardContentClaim
>> [resourceClaim=StandardResourceClaim[id=1461938937407-463,
>> container=default, section=463], offset=55767,
>> length=1557],offset=0,name=Merchant Group Memo from Daniel
>> Allegretti.msg,size=1557] due to java.io.FileNotFoundException:
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188570.txt
>> (Cannot allocate memory); routing to failure:
>> java.io.FileNotFoundException:
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188570.txt
>> (Cannot allocate memory)
>> 
>> 2016-04-29 10:16:28,036 ERROR [Timer-Driven Process Thread-2]
>> o.a.nifi.processors.standard.FetchFile
>> FetchFile[id=6c7482f2-5780-37c8-99a0-f2d87cbcbba9] Could not fetch file
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188583.txt from
>> file system for
>> StandardFlowFileRecord[uuid=a6a69ead-1c1e-4e52-9cf1-536f065f4e84,claim=StandardContentClaim
>> [resourceClaim=StandardResourceClaim[id=1461938937407-463,
>> container=default, section=463], offset=57324,
>> length=1450],offset=0,name=Thanksgiving pictures.msg,size=1450] due to
>> java.io.FileNotFoundException:
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188583.txt
>> (Cannot allocate memory); routing to failure:
>> java.io.FileNotFoundException:
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188583.txt
>> (Cannot allocate memory)
>> 
>> 2016-04-29 10:16:28,038 ERROR [Timer-Driven Process Thread-2]
>> o.a.nifi.processors.standard.FetchFile
>> FetchFile[id=6c7482f2-5780-37c8-99a0-f2d87cbcbba9] Could not fetch file
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188591.txt from
>> file system for
>> StandardFlowFileRecord[uuid=c23a560a-519f-4c34-9686-cda228fe3362,claim=StandardContentClaim
>> [resourceClaim=StandardResourceClaim[id=1461938937407-463,
>> container=default, section=463], offset=58774,
>> length=1420],offset=0,name=and more....msg,size=1420] due to
>> java.io.FileNotFoundException:
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188591.txt
>> (Cannot allocate memory); routing to failure:
>> java.io.FileNotFoundException:
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188591.txt
>> (Cannot allocate memory)
>> 
>> 2016-04-29 10:16:28,043 ERROR [Timer-Driven Process Thread-2]
>> o.a.nifi.processors.standard.FetchFile
>> FetchFile[id=6c7482f2-5780-37c8-99a0-f2d87cbcbba9] Could not fetch file
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188601.txt from
>> file system for
>> StandardFlowFileRecord[uuid=96f51398-8b92-485c-8e8f-a791f896ce9f,claim=StandardContentClaim
>> [resourceClaim=StandardResourceClaim[id=1461938937407-463,
>> container=default, section=463], offset=61578,
>> length=1349],offset=0,name=pix....msg,size=1349] due to
>> java.io.FileNotFoundException:
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188601.txt
>> (Cannot allocate memory); routing to failure:
>> java.io.FileNotFoundException:
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188601.txt
>> (Cannot allocate memory)
>> 
>> 2016-04-29 10:16:28,044 ERROR [Timer-Driven Process Thread-2]
>> o.a.nifi.processors.standard.FetchFile
>> FetchFile[id=6c7482f2-5780-37c8-99a0-f2d87cbcbba9] Could not fetch file
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188605.txt from
>> file system for
>> StandardFlowFileRecord[uuid=cdafaa63-b694-46da-9103-3da70e2c23ac,claim=StandardContentClaim
>> [resourceClaim=StandardResourceClaim[id=1461938937407-463,
>> container=default, section=463], offset=62927,
>> length=1361],offset=0,name=more of.....msg,size=1361] due to
>> java.io.FileNotFoundException:
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188605.txt
>> (Cannot allocate memory); routing to failure:
>> java.io.FileNotFoundException:
>> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188605.txt
>> (Cannot allocate memory)
> 
> 
> 
> 
> 
> --
> View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/FetchFile-Cannot-Allocate-Enough-Memory-tp9720p9724.html
> Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.


Re: FetchFile Cannot Allocate Enough Memory

Posted by "dale.chang13" <da...@outlook.com>.
Mark Payne wrote
> Dale,
> 
> I haven't seen this issue personally. I don't believe it has to do with
> content/flowfile
> repo space. Can you check the logs/nifi-app.log file and give us the exact
> error message
> from the logs, with the stack trace if it is provided?
> 
> Thanks
> -Mark

Sure, this is from one of the slave nodes. It hardly provides any
information. I suppose I could do a jstat or a df -h.

I've also created a MonitorMemory Reporting Task, but I cannot seem to
provide the correct names for the memory pool. The only one that works is the
G1 Old Gen memory pool.
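
If it helps, a quick way to see the exact pool names this JVM exposes (and
therefore what a memory-monitoring task would have to match) is to list the
MemoryPoolMXBeans; this is just a generic JMX sketch, not tied to NiFi:

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryPoolMXBean;

    public class ListMemoryPools {
        public static void main(String[] args) {
            // Prints the pool names exactly as the JVM reports them,
            // e.g. "G1 Old Gen", "G1 Eden Space", "Metaspace".
            for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
                System.out.println(pool.getName() + " (" + pool.getType() + ")");
            }
        }
    }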

app.log wrote
> 2016-04-29 10:16:28,027 ERROR [Timer-Driven Process Thread-6]
> o.a.nifi.processors.standard.FetchFile
> FetchFile[id=6c7482f2-5780-37c8-99a0-f2d87cbcbba9] Could not fetch file
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188512.txt from
> file system for
> StandardFlowFileRecord[uuid=1a2d7918-377e-4256-8610-1b12493eb16e,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1461938937407-463,
> container=default, section=463], offset=47518,
> length=1268],offset=0,name=FW: FERC Daily News.msg,size=1268] due to
> java.io.FileNotFoundException:
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188512.txt
> (Cannot allocate memory); routing to failure:
> java.io.FileNotFoundException:
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188512.txt
> (Cannot allocate memory)
> 
> 2016-04-29 10:16:28,028 ERROR [Timer-Driven Process Thread-6]
> o.a.nifi.processors.standard.FetchFile
> FetchFile[id=6c7482f2-5780-37c8-99a0-f2d87cbcbba9] Could not fetch file
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188518.txt from
> file system for
> StandardFlowFileRecord[uuid=3b5fef42-2ded-47cc-aba2-6caf95f04977,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1461938937407-463,
> container=default, section=463], offset=48786,
> length=1640],offset=0,name=FW: FERC Docket No. EL01-47:  Removing
> Obstacles To Increased Eleu0020ctri c Generation And Natural Gas Supply In
> The Western United States.msg,size=1640] due to
> java.io.FileNotFoundException:
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188518.txt
> (Cannot allocate memory); routing to failure:
> java.io.FileNotFoundException:
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188518.txt
> (Cannot allocate memory)
> 
> 2016-04-29 10:16:28,029 ERROR [Timer-Driven Process Thread-8]
> o.a.nifi.processors.standard.FetchFile
> FetchFile[id=6c7482f2-5780-37c8-99a0-f2d87cbcbba9] Could not fetch file
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188526.txt from
> file system for
> StandardFlowFileRecord[uuid=71715448-2acd-4f5c-af57-9209461fe62e,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1461938937407-463,
> container=default, section=463], offset=50426,
> length=1272],offset=0,name=FW: workshop notice.msg,size=1272] due to
> java.io.FileNotFoundException:
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188526.txt
> (Cannot allocate memory); routing to failure:
> java.io.FileNotFoundException:
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188526.txt
> (Cannot allocate memory)
> 
> 2016-04-29 10:16:28,030 ERROR [Timer-Driven Process Thread-9]
> o.a.nifi.processors.standard.FetchFile
> FetchFile[id=6c7482f2-5780-37c8-99a0-f2d87cbcbba9] Could not fetch file
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188534.txt from
> file system for
> StandardFlowFileRecord[uuid=0bf8666b-a9bc-4412-905e-6e4a2b13253d,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1461938937407-463,
> container=default, section=463], offset=51698,
> length=1440],offset=0,name=FW: Final Report on Workshop Report to Discuss
> Alternative Gas Inu0020dice s.msg,size=1440] due to
> java.io.FileNotFoundException:
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188534.txt
> (Cannot allocate memory); routing to failure:
> java.io.FileNotFoundException:
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188534.txt
> (Cannot allocate memory)
> 
> 2016-04-29 10:16:28,034 ERROR [Timer-Driven Process Thread-3]
> o.a.nifi.processors.standard.FetchFile
> FetchFile[id=6c7482f2-5780-37c8-99a0-f2d87cbcbba9] Could not fetch file
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188558.txt from
> file system for
> StandardFlowFileRecord[uuid=805d4127-d86b-4b4b-a7a0-9f6f300dc13e,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1461938937407-463,
> container=default, section=463], offset=54433,
> length=1334],offset=0,name=FW: Proposed NARUC resolution on
> hedging.msg,size=1334] due to java.io.FileNotFoundException:
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188558.txt
> (Cannot allocate memory); routing to failure:
> java.io.FileNotFoundException:
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188558.txt
> (Cannot allocate memory)
> 
> 2016-04-29 10:16:28,035 ERROR [Timer-Driven Process Thread-3]
> o.a.nifi.processors.standard.FetchFile
> FetchFile[id=6c7482f2-5780-37c8-99a0-f2d87cbcbba9] Could not fetch file
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188570.txt from
> file system for
> StandardFlowFileRecord[uuid=38ee3345-fa72-459c-92f0-7fc2d58160ba,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1461938937407-463,
> container=default, section=463], offset=55767,
> length=1557],offset=0,name=Merchant Group Memo from Daniel
> Allegretti.msg,size=1557] due to java.io.FileNotFoundException:
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188570.txt
> (Cannot allocate memory); routing to failure:
> java.io.FileNotFoundException:
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188570.txt
> (Cannot allocate memory)
> 
> 2016-04-29 10:16:28,036 ERROR [Timer-Driven Process Thread-2]
> o.a.nifi.processors.standard.FetchFile
> FetchFile[id=6c7482f2-5780-37c8-99a0-f2d87cbcbba9] Could not fetch file
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188583.txt from
> file system for
> StandardFlowFileRecord[uuid=a6a69ead-1c1e-4e52-9cf1-536f065f4e84,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1461938937407-463,
> container=default, section=463], offset=57324,
> length=1450],offset=0,name=Thanksgiving pictures.msg,size=1450] due to
> java.io.FileNotFoundException:
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188583.txt
> (Cannot allocate memory); routing to failure:
> java.io.FileNotFoundException:
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188583.txt
> (Cannot allocate memory)
> 
> 2016-04-29 10:16:28,038 ERROR [Timer-Driven Process Thread-2]
> o.a.nifi.processors.standard.FetchFile
> FetchFile[id=6c7482f2-5780-37c8-99a0-f2d87cbcbba9] Could not fetch file
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188591.txt from
> file system for
> StandardFlowFileRecord[uuid=c23a560a-519f-4c34-9686-cda228fe3362,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1461938937407-463,
> container=default, section=463], offset=58774,
> length=1420],offset=0,name=and more....msg,size=1420] due to
> java.io.FileNotFoundException:
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188591.txt
> (Cannot allocate memory); routing to failure:
> java.io.FileNotFoundException:
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188591.txt
> (Cannot allocate memory)
> 
> 2016-04-29 10:16:28,043 ERROR [Timer-Driven Process Thread-2]
> o.a.nifi.processors.standard.FetchFile
> FetchFile[id=6c7482f2-5780-37c8-99a0-f2d87cbcbba9] Could not fetch file
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188601.txt from
> file system for
> StandardFlowFileRecord[uuid=96f51398-8b92-485c-8e8f-a791f896ce9f,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1461938937407-463,
> container=default, section=463], offset=61578,
> length=1349],offset=0,name=pix....msg,size=1349] due to
> java.io.FileNotFoundException:
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188601.txt
> (Cannot allocate memory); routing to failure:
> java.io.FileNotFoundException:
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188601.txt
> (Cannot allocate memory)
> 
> 2016-04-29 10:16:28,044 ERROR [Timer-Driven Process Thread-2]
> o.a.nifi.processors.standard.FetchFile
> FetchFile[id=6c7482f2-5780-37c8-99a0-f2d87cbcbba9] Could not fetch file
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188605.txt from
> file system for
> StandardFlowFileRecord[uuid=cdafaa63-b694-46da-9103-3da70e2c23ac,claim=StandardContentClaim
> [resourceClaim=StandardResourceClaim[id=1461938937407-463,
> container=default, section=463], offset=62927,
> length=1361],offset=0,name=more of.....msg,size=1361] due to
> java.io.FileNotFoundException:
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188605.txt
> (Cannot allocate memory); routing to failure:
> java.io.FileNotFoundException:
> /tmp/hddCobrasan/Export1/VOL000001/TEXT/TEXT000001/ENR-00188605.txt
> (Cannot allocate memory)





--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/FetchFile-Cannot-Allocate-Enough-Memory-tp9720p9724.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.

Re: FetchFile Cannot Allocate Enough Memory

Posted by Mark Payne <ma...@hotmail.com>.
Dale,

I haven't seen this issue personally. I don't believe it has to do with content/flowfile
repo space. Can you check the logs/nifi-app.log file and give us the exact error message
from the logs, with the stack trace if it is provided?

Thanks
-Mark

> On Apr 29, 2016, at 9:12 AM, dale.chang13 <da...@outlook.com> wrote:
> 
> I have been trying to run my data flow and have been running into a problem
> where FetchFile is unable to read files. I will detail my process below, and I
> would like some confirmation of my suspicions.
> 
> First I am ingesting an initial file that is fairly large, which contains
> the path/filename of a ton of text files within another directory. The goal
> is to read in the content of that large file, then read in the contents of
> the thousands of text files, and then store the text file content into Solr.
> 
> The problem I am having is that the second FetchFile, the one that reads in
> the smaller text files, frequently reports an error: "FileNotFoundException
> xxx.txt (Cannot allocate memory); routing to failure". This FetchFile runs
> successfully for about 20,000 files and then reports the above error
> continuously for the rest of the files.
> 
> My suspicion comes down to two possibilities: not enough heap space, or not
> enough content_repo/flowfile_repo space. Any ideas or questions?
> 
> 
> 
> --
> View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/FetchFile-Cannot-Allocate-Enough-Memory-tp9720.html
> Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.