Posted to users@jena.apache.org by Mikael Pesonen <mi...@lingsoft.fi> on 2019/02/05 11:06:45 UTC

Re: Out of memory

Tested with 16 GB of RAM, and Java memory usage goes up to 10 GB (14 GB virtual).
I wonder what java -Xms actually does...

So was there no way to limit memory usage on the 8 GB server?


On 29/01/2019 19:41, Dan Pritts wrote:
> It's often misunderstood, but Java programs use memory in addition to the
> configured heap.  Fuseki in my experience sometimes uses a LOT more, more
> than I could explain.  Some of the folks here (Andy for sure) spent some
> time looking at it with me and weren't able to come to any conclusions.
> You can look through the list archives for the discussion, maybe 6 months
> ago.
>
> I ended up significantly overallocating memory to the instance and being
> done with it.
>
> How much RAM does your instance have?  You mentioned -Xmx 5600, and total
> usage of 17GB ram+swap - sounds like you have maybe 8GB ram?    I'd try
> 16GB and see how it does; watch the total memory usage.
>
>
>
> On Tue, Jan 29, 2019 at 9:43 AM Mikael Pesonen <mi...@lingsoft.fi>
> wrote:
>
>>
>>
>> On 29/01/2019 16:28, Rob Vesse wrote:
>>> This may be partly a case of a simple-looking query having unexpected
>>> execution semantics.  Strictly speaking your query says select all triples
>>> in the specific graph then join them with this list of values for ?s.  Now
>>> the optimiser should, and does appear to, do the right thing and flip the
>>> join order, i.e. it uses the concrete values from the VALUES block to search
>>> for triples with those subjects in the specific graph.  However, if the
>>> query had other elements involved the optimiser might not kick in; a better
>>> query would place the VALUES prior to using the variables defined in the
>>> VALUES block.
>>
>> Thanks for the reminder on VALUES order.
>>
>>> This sounds like memory/cache thrashing.  From what you have described,
>>> running variants on this query 50k times, you are basically walking over
>>> your entire dataset extracting it piece by piece?
>>
>> The dataset is larger; these small sets (VALUES) come from our
>> external index for similar-document search. The index returns an id, and
>> the related metadata is fetched from Jena.
>>
>>> Assuming the Graph URI and the URIs in your VALUES block change in each
>>> query then every query is looking at a different section of the database,
>>> causing a lot of data to be cached and then evicted, both in terms of
>>> on-heap memory structures (the node table cache) and potentially also for
>>> the off-heap memory-mapped files, which may be being paged in and out as
>>> the code traverses the B-Tree indexes.
>>>
>>> Is there also some other query involved that extracts the Graph URIs and
>>> Subject URIs of interest that is being executed in parallel with the
>>> script?  Or has the input from the script been pre-calculated ahead of
>>> time, comes from elsewhere etc?
>>
>> There is no parallelism on our part in this case. Only one PHP script is
>> running and making GSP calls.
>>
>>> Rob
>>>
>>> On 29/01/2019, 14:06, "Mikael Pesonen" <mi...@lingsoft.fi> wrote:
>>>
>>>       Server:
>>>
>>>       /usr/bin/java
>>>       -Dlog4j.configuration=file:/home/text/tools/apache-jena-fuseki-3.9.0/log4j.properties
>>>       -Xmx5600M -jar fuseki-server.jar --update --port 3030
>>>       --loc=/home/text/tools/jena_data_test/ /ds
>>>
>>>       No custom configs, default installation package.
>>>
>>>
>>>       SPARQL similar to this (returns 5-10 triples):
>>>
>>>       CONSTRUCT { ?s ?p ?o }
>>>       FROM <https://resource.lingsoft.fi/4f13c609-48b4-4e4d-a40b-2d7946f88234/>
>>>       WHERE
>>>       {
>>>                ?s ?p ?o
>>>
>>>       VALUES ?s {lsr:10609f75-5cf3-4544-8fc1-c361778c3bd8
>>>       lsr:88d0bb8c-35d8-4051-a27d-a0d93af77985
>>>       lsr:fc7b2c65-453e-469b-9c5d-8c7ee4ee6902
>>>       lsr:239c6da0-4c24-4539-a277-c9756d6257ee
>>>       lsr:2ef0190d-6271-447a-992f-6225fc440897
>>>       lsr:6aaf601c-ccf4-4e59-9757-1a463db49fa9
>>>       lsr:d7c9dc96-cd61-4a31-b466-bb2491a3ceaf
>>>       lsr:6f6802cf-0336-4234-90b8-cc8780058f0d
>>>       lsr:d1e2751b-4332-4d57-95e4-ca8070c16782
>>>       lsr:81053775-4722-4a00-b3f7-33d4feb3629b}
>>>       }
>>>
>>>
>>>       I solved this by adding a sleep to the script. So I guess it's about
>>>       the Java memory manager not getting time to free memory? Even with
>>>       the sleep it was barely doable, with memory consumption changing
>>>       rapidly between 1.5 GB and 6 GB.
>>>
>>>
>>>       On 29/01/2019 15:50, Andy Seaborne wrote:
>>>       > Mikael,
>>>       >
>>>       > There aren't enough details except to mention the suspects like
>>>       > sorting.
>>>       >
>>>       > With all the questions on the list, I personally don't track the
>>>       > details of each installation so please also remind me of your
>>>       > current setup.
>>>       >
>>>       >     Andy
>>>       >
>>>       > On 29/01/2019 11:32, Mikael Pesonen wrote:
>>>       >>
>>>       >> I'm not able to run a basic read-only script without running out
>>>       >> of memory on the server.
>>>       >>
>>>       >> Consumption goes to 7+ GB (VM 10+ GB), then the system kills
>>>       >> Fuseki when it runs out of memory.
>>>       >> All I'm running is a simple SPARQL query getting a few triples of
>>>       >> a resource. This is run about 50k times.
>>>       >>
>>>       >> All settings are default, using GSP.
>>>       >>
>>>       >>
>>
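Rob's advice in the quoted thread, to place the VALUES block before the variables it binds are used, can be sketched as a small query builder. This is only an illustration under stated assumptions: the function name `build_construct_query` and its parameters are hypothetical, and the PREFIX declaration for `lsr:` has to be supplied by the caller because the thread never shows it.

```python
def build_construct_query(graph_iri, subjects, prologue=""):
    """Return a CONSTRUCT query with VALUES ahead of the triple pattern.

    graph_iri -- IRI of the graph to read, without angle brackets
    subjects  -- SPARQL terms for ?s, e.g. "lsr:10609f75-..." or "<https://...>"
    prologue  -- PREFIX declarations needed by any prefixed names (assumption:
                 the caller knows them; they are not shown in the thread)
    """
    if not subjects:
        # An empty VALUES block would bind nothing, so the query is pointless.
        raise ValueError("expected at least one subject")
    values = "\n".join(f"    {term}" for term in subjects)
    return (
        f"{prologue}"
        "CONSTRUCT { ?s ?p ?o }\n"
        f"FROM <{graph_iri}>\n"
        "WHERE {\n"
        "  VALUES ?s {\n"
        f"{values}\n"
        "  }\n"
        "  ?s ?p ?o\n"
        "}\n"
    )
```

Calling it with the graph IRI and the ids returned by the external index yields the same query shape as in the thread, only with the VALUES block first, so the result does not depend on the optimiser reordering the join.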

-- 
Lingsoft - 30 years of Leading Language Management

www.lingsoft.fi

Speech Applications - Language Management - Translation - Reader's and Writer's Tools - Text Tools - E-books and M-books

Mikael Pesonen
System Engineer

e-mail: mikael.pesonen@lingsoft.fi
Tel. +358 2 279 3300

Time zone: GMT+2

Helsinki Office
Eteläranta 10
FI-00130 Helsinki
FINLAND

Turku Office
Kauppiaskatu 5 A
FI-20100 Turku
FINLAND


Re: Out of memory

Posted by Mikael Pesonen <mi...@lingsoft.fi>.
Sorry, I meant -Xmx. Debugging Fuseki is outside the scope of my work, but
it is good to know that 16 GB seems to do the trick. So maybe we will deploy
more cases on the same bigger server instead of splitting them across smaller ones.

On 05/02/2019 17:48, Rob Vesse wrote:
> And I realise browsing back through the thread that you mentioned that you don't have a desktop in a previous reply.  So I presume you mean you only have terminal access to the machine where you are running Fuseki?
>
> In which case you might want to try out jvmtop - https://github.com/patric-r/jvmtop - as an open source command line based JVM profiler
>
> Rob
>



Re: Out of memory

Posted by Rob Vesse <rv...@dotnetrdf.org>.
And I realise, browsing back through the thread, that you mentioned in a previous reply that you don't have a desktop.  So I presume you mean you only have terminal access to the machine where you are running Fuseki?

In which case you might want to try out jvmtop - https://github.com/patric-r/jvmtop - an open source, command-line-based JVM profiler.
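
Alongside a profiler, total process memory (heap plus everything outside it) can be watched from a terminal on Linux by reading `/proc/<pid>/status`. The following is a rough sketch, assuming a Linux host; the function names are hypothetical, and the numbers are OS-level only, so they say nothing about where inside the JVM the memory goes.

```python
import time


def parse_status(text):
    """Extract VmRSS and VmSize (reported in kB) from /proc/<pid>/status text."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmRSS", "VmSize"):
            fields[key] = int(rest.split()[0])  # first token is the kB figure
    return fields


def sample(pid):
    """Read the current resident and virtual size of a process, in kB."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_status(handle.read())


def watch(pid, samples=3, interval=5.0):
    """Print a few samples; the count and interval are arbitrary defaults."""
    for _ in range(samples):
        usage = sample(pid)
        print(f"resident={usage['VmRSS'] / 1024:.0f} MiB "
              f"virtual={usage['VmSize'] / 1024:.0f} MiB")
        time.sleep(interval)
```

`watch(pid)` with the PID of the Fuseki JVM prints resident and virtual size side by side, which correspond to the two kinds of figure quoted in this thread (for example "10G (virt 14G)").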

Rob


Re: Out of memory

Posted by Rob Vesse <rv...@dotnetrdf.org>.
-Xms and -Xmx do two different things (the previous email in the thread mentioned -Xmx but then you referenced -Xms in your question).

The former sets the initial (minimum) heap size, which is the amount of memory the JVM allocates for the heap when it starts.

The latter sets the maximum heap size, which is the most memory the JVM will allocate for the heap at runtime.  The heap may start smaller than this and grow up to this maximum.

When one or both of these are not set, your JVM chooses default values, usually based upon some percentage of the system memory.  Exact behaviour will vary between JVMs.
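
To make the distinction concrete, here is a small sketch that picks -Xms and -Xmx out of a java command line and converts them to bytes. The helper names are hypothetical, and the suffix handling is an assumption limited to k, m and g in either case, with a bare number read as bytes. Both flags bound only the heap, so, as Dan pointed out earlier in the thread, the process as a whole can use more than -Xmx; the second helper gives a lower bound on that extra memory.

```python
import re

_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
_FLAG = re.compile(r"^-(Xms|Xmx)(\d+)([kKmMgG]?)$")


def heap_settings(argv):
    """Map 'Xms'/'Xmx' to a byte count for each such flag present in argv."""
    settings = {}
    for arg in argv:
        match = _FLAG.match(arg)
        if match:
            name, number, unit = match.groups()
            settings[name] = int(number) * _UNITS[unit.lower()]
    return settings


def outside_heap(resident_bytes, settings):
    """Lower bound on resident memory held outside the heap.

    The heap can never exceed -Xmx, so anything above it must live elsewhere.
    Returns None when -Xmx is not set, because the default cap is unknown here.
    """
    if "Xmx" not in settings:
        return None
    return max(0, resident_bytes - settings["Xmx"])
```

With the figures from this thread, -Xmx5600M and a process at about 7 GB, the lower bound comes out at roughly 1.5 GB held outside the heap (treating the reported 7 GB as 7 GiB, which is an assumption).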

As I think has been suggested earlier in this thread, if you are continuing to have issues with memory consumption, your best bet to investigate further is to attach a JVM profiler to the running Fuseki process.  With that you can take snapshots of the memory usage over time and inspect them to see where the memory consumption is going.

VisualVM - https://visualvm.github.io - is one such free tool; there are of course other free and proprietary JVM profilers available.

Rob

On 05/02/2019, 11:07, "Mikael Pesonen" <mi...@lingsoft.fi> wrote:

    
    Tested with 16GB, and java mem usage goes up to 10G (virt 14G). 
    Wondering what does the java -Xms do actually...
    
    There was no way to limit mem usage for 8GB server?
    
    
    On 29/01/2019 19:41, Dan Pritts wrote:
    > It's often misunderstood, but Java programs use memory in addition to the
    > configured heap.  Fuseki in my experience sometimes uses a LOT more, more
    > than I could explain.  Some of the folks here (Andy for sure) spent some
    > time looking at it with me and weren't able to come to any conclusions.
    > You can look throught he list archives for the discussion, maybe 6 months
    > ago.
    >
    > I ended up significantly overallocating memory to the instance and being
    > done with it.
    >
    > How much RAM does your instance have?  You mentioned -Xmx 5600, and total
    > usage of 17GB ram+swap - sounds like you have maybe 8GB ram?    I'd try
    > 16GB and see how it does; watch the total memory usage.
    >
    >
    >
    > On Tue, Jan 29, 2019 at 9:43 AM Mikael Pesonen <mi...@lingsoft.fi>
    > wrote:
    >
    >>
    >>
    >> On 29/01/2019 16:28, Rob Vesse wrote:
    >>> This may be partly a case of a simple-looking query having unexpected
    >>> execution semantics.  Strictly speaking, your query says: select all triples
    >>> in the specific graph, then join them with this list of values for ?s.  Now
    >>> the optimiser should, and does appear to, do the right thing and flip the
    >>> join order, i.e. it uses the concrete values from the VALUES block to search
    >>> for triples with those subjects in the specific graph.  However, if the
    >>> query had other elements involved, the optimiser might not kick in; a better
    >>> query would place the VALUES prior to using the variables defined in the
    >>> VALUES block.
    >> Thanks for the reminder on VALUES order
    >>> This sounds like memory/cache thrashing.  From what you have described,
    >>> running variants on this query 50k times, you are basically walking over
    >>> your entire dataset extracting it piece by piece?
    >> The dataset is larger; these small sets (VALUES) come from our
    >> external index for similar-document search. The index returns an id, and
    >> the related metadata is fetched from Jena.
    >>> Assuming the Graph URI and the URIs in your VALUES block change in each
    >>> query then every query is looking at a different section of the database
    >>> causing a lot of data to be cached and then evicted both in terms of
    >>> on-heap memory structures (the node table cache) and potentially also for
    >>> the off-heap memory-mapped files which may be being paged in and out as the
    >>> code traverses the B-Tree indexes.
    >>> Is there also some other query involved that extracts the Graph URIs and
    >>> Subject URIs of interest that is being executed in parallel with the
    >>> script?  Or has the input from the script been pre-calculated ahead of
    >>> time, comes from elsewhere etc?
    >> There is no parallelism on our part in this case. Only one PHP script
    >> running and making GSP calls.
    >>> Rob
    >>>
    >>> On 29/01/2019, 14:06, "Mikael Pesonen" <mi...@lingsoft.fi> wrote:
    >>>
    >>>       Server:
    >>>
    >>>       /usr/bin/java
    >>>       -Dlog4j.configuration=file:/home/text/tools/apache-jena-fuseki-3.9.0/log4j.properties
    >>>       -Xmx5600M -jar fuseki-server.jar --update --port 3030
    >>>       --loc=/home/text/tools/jena_data_test/ /ds
    >>>
    >>>       No custom configs, default installation package.
    >>>
    >>>
    >>>       SPARQL similar to this (returns 5-10 triples):
    >>>
    >>>       CONSTRUCT { ?s ?p ?o }
    >>>       FROM <https://resource.lingsoft.fi/4f13c609-48b4-4e4d-a40b-2d7946f88234/>
    >>>       WHERE
    >>>       {
    >>>                ?s ?p ?o
    >>>
    >>>       VALUES ?s {lsr:10609f75-5cf3-4544-8fc1-c361778c3bd8
    >>>       lsr:88d0bb8c-35d8-4051-a27d-a0d93af77985
    >>>       lsr:fc7b2c65-453e-469b-9c5d-8c7ee4ee6902
    >>>       lsr:239c6da0-4c24-4539-a277-c9756d6257ee
    >>>       lsr:2ef0190d-6271-447a-992f-6225fc440897
    >>>       lsr:6aaf601c-ccf4-4e59-9757-1a463db49fa9
    >>>       lsr:d7c9dc96-cd61-4a31-b466-bb2491a3ceaf
    >>>       lsr:6f6802cf-0336-4234-90b8-cc8780058f0d
    >>>       lsr:d1e2751b-4332-4d57-95e4-ca8070c16782
    >>>       lsr:81053775-4722-4a00-b3f7-33d4feb3629b}
    >>>       }
    >>>
    >>>
    >>>       I solved this by adding a sleep to the script. So I guess it's about the
    >>>       Java memory manager not getting time to free memory? Even with the sleep
    >>>       it was barely doable, with memory consumption changing rapidly between
    >>>       1.5 and 6 GB.
    >>>
    >>>
    >>>       On 29/01/2019 15:50, Andy Seaborne wrote:
    >>>       > Mikael,
    >>>       >
    >>>       > There aren't enough details except to mention the suspects like
    >>>       > sorting.
    >>>       >
    >>>       > With all the questions on the list, I personally don't track the
    >>>       > details of each installation so please also remind me of your
    >>>       > current setup.
    >>>       >
    >>>       >     Andy
    >>>       >
    >>>       > On 29/01/2019 11:32, Mikael Pesonen wrote:
    >>>       >>
    >>>       >> I'm not able to run a basic read-only script without running out
    >>>       >> of memory on the server.
    >>>       >>
    >>>       >> Consumption goes to 7+ GB (VM 10+ GB), then the system kills
    >>>       >> Fuseki when it runs out of memory.
    >>>       >> All I'm running is a simple SPARQL query getting a few triples of a
    >>>       >> resource. This is run about 50k times.
    >>>       >>
    >>>       >> All settings are default, using GSP.
    >>>       >>
    >>>       >>
    >>>
    >>>       --
    >>>       Lingsoft - 30 years of Leading Language Management
    >>>
    >>>       www.lingsoft.fi
    >>>
    >>>       Speech Applications - Language Management - Translation - Reader's
    >>>       and Writer's Tools - Text Tools - E-books and M-books
    >>>       Mikael Pesonen
    >>>       System Engineer
    >>>
    >>>       e-mail: mikael.pesonen@lingsoft.fi
    >>>       Tel. +358 2 279 3300
    >>>
    >>>       Time zone: GMT+2
    >>>
    >>>       Helsinki Office
    >>>       Eteläranta 10
    >>>       FI-00130 Helsinki
    >>>       FINLAND
    >>>
    >>>       Turku Office
    >>>       Kauppiaskatu 5 A
    >>>       FI-20100 Turku
    >>>       FINLAND
    >>>
    >>>
    >>>
    >>>
    >>>
    >>>
    