Posted to users@jena.apache.org by christophe heligon <ch...@univ-rennes1.fr> on 2022/10/20 11:29:52 UTC

Memory leak in fuseki? (attached file included)

Hi everyone, 

I am using Apache Jena Fuseki 4.6.1 to host 2 datasets (rdf + rdf-star). 

I have noticed unexpected behaviour on my server: the RAM fills up little by little over time (about 100 MB every few minutes, which can add up to several GB) even when the server receives no queries. When I submit a query using the Fuseki GUI, the RAM is freed (at least partially) of whatever seemed to have accumulated between queries. 

The Java process is the one using that memory (see attached file). Note that I use the standard configuration provided with the server, except for the max memory, which I set to 20 GB. The Java process greatly exceeds this limit, as it may reach over 50 GB. 

Is it a known issue of Fuseki? Java? 
Is there a way to monitor that better than just record memory use? 

Any fix published or hint on how to solve that? 

Best regards, 
Christophe 

-- 
Christophe Héligon
Institut de Génétique & Développement de Rennes
Equipe Ingénierie Inverse de la Division Cellulaire (CeDRE)
christophe.heligon@univ-rennes1.fr
https://igdr.univ-rennes1.fr
UMR 6290 CNRS - UR1, ERL Inserm U1305
Campus Santé de Villejean, 2 avenue du Professeur Léon Bernard,
CS 34317 / 35043 Rennes Cedex, France

Re: Memory leak in fuseki? (attached file included)

Posted by Andy Seaborne <an...@apache.org>.
Hi Christophe,

This behaviour is to be expected. It is a sign there is a lot of unused 
memory and the file system cache has used it.  All unused memory (Linux, 
Mac, Windows) is used by the OS for the file system cache automatically.

 > Is there a way to monitor that better than just record memory use?

VisualVM shows the heap and also allows you to force a garbage 
collection as well as look at the heap.
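
A minimal sketch of reading the same heap figures programmatically, assuming the standard java.lang.management API. The class name HeapReport is made up for illustration. It reports on the JVM it runs in, so for a running Fuseki server you would still attach a tool such as VisualVM.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapReport {
    /** Formats the heap usage of the JVM this code runs in (hypothetical helper). */
    static String describeHeap() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        long mib = 1024L * 1024L;
        // getMax() returns -1 when no maximum is defined; that prints as 0 MiB here.
        return String.format("heap used=%d MiB, committed=%d MiB, max=%d MiB",
                heap.getUsed() / mib, heap.getCommitted() / mib, heap.getMax() / mib);
    }

    public static void main(String[] args) {
        System.out.println(describeHeap());
    }
}
```

"committed" is what the JVM has currently claimed from the OS for the heap; "max" corresponds to the -Xmx setting.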


There is no need to set the heap as high as 20G - in fact, it will slow 
the server down!


The figures that top(1) shows are the total virtual memory (VIRT) and 
resident memory (RES) for the whole OS process.

For Fuseki (TDB2) this is not the heap size (which is why the process 
size VIRT is showing larger than the heap).

TDB2 uses memory mapped files. These files are in the OS file system 
cache and are "mapped" into the process's virtual memory. The OS 
manages which areas of the file are really in-memory and which aren't.
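
A small sketch of what a memory mapped file looks like from Java, assuming the standard java.nio API. This is not TDB2's own code, and the class and method names are made up for illustration. The mapped region lives outside the Java heap, so it is not limited by -Xmx.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedFileDemo {
    /** Maps a fresh temporary file of the given size into memory (hypothetical helper). */
    static MappedByteBuffer mapTempFile(int sizeBytes) {
        try {
            Path file = Files.createTempFile("mapped-demo", ".dat");
            file.toFile().deleteOnExit();
            try (FileChannel channel = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // A READ_WRITE mapping past the end of the file grows the file.
                // The mapping stays valid after the channel is closed.
                return channel.map(FileChannel.MapMode.READ_WRITE, 0, sizeBytes);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        MappedByteBuffer buffer = mapTempFile(64 * 1024 * 1024);
        // Touching a byte makes the OS bring that page into resident memory.
        buffer.put(0, (byte) 42);
        System.out.println("off-heap (direct) buffer: " + buffer.isDirect());
        System.out.println("mapped bytes: " + buffer.capacity());
    }
}
```

The whole mapping counts towards the process's virtual size as soon as it is created, while only the pages actually touched become resident.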

The OS will grow the file system cache to use all available memory for 
resident segments of files. It will automatically shrink the resident 
space if there is demand from other processes. But the files are still 
in the virtual memory space of the process.

So the virtual space becomes the entire space for the index files but 
not all of the files are in real memory. top(1) includes the size of 
files touched.
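
A sketch of reading the virtual and resident figures directly, assuming a Linux system where /proc/self/status is available. The class name ProcStatus and the helper parseKb are made up for illustration. VmSize roughly corresponds to VIRT and VmRSS to RES in top(1); on newer kernels RssFile gives the file-backed part of the resident memory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.OptionalLong;

public class ProcStatus {
    /** Returns the kB value for a key such as "VmRSS", if the key is present. */
    static OptionalLong parseKb(List<String> statusLines, String key) {
        String prefix = key + ":";
        for (String line : statusLines) {
            if (line.startsWith(prefix)) {
                // Lines look like "VmRSS:     123456 kB".
                String[] parts = line.substring(prefix.length()).trim().split("\\s+");
                return OptionalLong.of(Long.parseLong(parts[0]));
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) throws IOException {
        Path status = Paths.get("/proc/self/status"); // Linux only
        if (!Files.exists(status)) {
            System.out.println("/proc/self/status is not available on this OS");
            return;
        }
        List<String> lines = Files.readAllLines(status);
        // -1 means the field was not found in the status file.
        System.out.println("VmSize kB:  " + parseKb(lines, "VmSize").orElse(-1));
        System.out.println("VmRSS kB:   " + parseKb(lines, "VmRSS").orElse(-1));
        System.out.println("RssFile kB: " + parseKb(lines, "RssFile").orElse(-1));
    }
}
```

To inspect another process, such as the Fuseki server, the same parsing would apply to /proc/<pid>/status for that process id.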

If you want to see the heap, use VisualVM.


The trouble with a big heap is that Java will grow the heap while doing 
lightweight garbage collections, but not do a full GC until it gets 
close to the max heap size. Only a full GC frees up all unused memory; 
the lightweight GCs balance reclaiming and low performance impact, with 
the effect of not reclaiming everything. You will see a slowly rising 
saw-tooth in VisualVM, then a big drop as a full GC cuts in.
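
The split between lightweight and full collections can also be seen in the JVM's own counters. A minimal sketch, assuming the standard java.lang.management API; the class name GcReport is made up for illustration, and the collector names depend on the GC in use (with G1 they are typically "G1 Young Generation" and "G1 Old Generation").

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcReport {
    /** One line per collector of the current JVM: name, run count, accumulated time. */
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are -1 if the collector does not report them.
            lines.add(String.format("%s: collections=%d, total time=%d ms",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        describeCollectors().forEach(System.out::println);
    }
}
```

A young-generation count that keeps climbing while the old-generation count stays flat would match the rising part of the saw-tooth.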

A smaller heap stops the JVM delaying the full GC and makes the 
throughput impact of the full GC smaller.

But a growing heap is squeezing out the resident parts of the virtual 
memory from the index files. The OS does not know some space is 
(probably) unused.

There is then less file system cache for the memory mapped files, which 
means more I/O to manage virtual vs resident, which means Fuseki is slower.

     Andy


Re: Memory leak in fuseki? (attached file included)

Posted by Andy Seaborne <an...@apache.org>.

On 21/10/2022 13:42, Theodore.Hills@morganstanley.com wrote:
> Hello Christophe,
> 
> I am relatively new to Fuseki, so take my feedback with a grain of salt.
> 
> I recall reading that Fuseki accumulates updates in memory while readers are active, and that only when there are no readers does it apply accumulated updates. This means that if you have a multi-threaded application with frequent reads and writes, you will likely gradually fill up memory until there’s a pause in reading and there’s a chance to write out the accumulated updates. I experienced this with an application I was running with 30 threads in parallel. It filled up my workstation’s memory, and my workstation crashed. I reduced the number of concurrent threads to 1, and the problem did not recur.

This is true but only for TDB1 and only if a sequence of overlapping 
requests happens during and continues after a writer. (If it had enough 
memory it gets quite pushy about letting the writer post-commit 
finalization happen.)

This is one of the reasons for TDB2.  It does not accumulate updates in 
memory - in fact it writes changes back during the transaction.

     Andy


RE: Memory leak in fuseki? (attached file included)

Posted by "Theodore.Hills@morganstanley.com" <Th...@morganstanley.com>.
Hello Christophe,

I am relatively new to Fuseki, so take my feedback with a grain of salt.

I recall reading that Fuseki accumulates updates in memory while readers are active, and that only when there are no readers does it apply accumulated updates. This means that if you have a multi-threaded application with frequent reads and writes, you will likely gradually fill up memory until there’s a pause in reading and there’s a chance to write out the accumulated updates. I experienced this with an application I was running with 30 threads in parallel. It filled up my workstation’s memory, and my workstation crashed. I reduced the number of concurrent threads to 1, and the problem did not recur.

You could try reducing the number of threads, or try pausing all readers—if that’s possible—to give Fuseki a chance to write out accumulated updates.

All the best
Ted

Theodore Hills
Consultant | Research
Phone: +1 212 296-1833
Theodore.Hills@morganstanley.com