Posted to users@jena.apache.org by David Habgood <dc...@gmail.com> on 2022/08/22 13:33:07 UTC

Scalability of Jena GeoSPARQL

Hi,

I've been using Jena GeoSPARQL, and a dockerised copy of the spatial
indexer (https://github.com/zazuko/spatial-indexer - thanks guys, very
useful!). I've been running Jena GeoSPARQL fine with a spatial index for
smaller datasets.

I have a spatial dataset that is around 160 GB of N-Quads (uncompressed), of
which, at a guess, 5% of the triples are geometry literals. The Jena
spatial indexer generates a spatial index close to 4 GB for it. I've been
unable to start a Jena GeoSPARQL instance for this dataset: I get
out-of-memory errors on startup, even after trying different heap values.
The infrastructure I've used is the largest AWS Fargate task available,
4 vCPU and 30 GB RAM.

Could anyone hazard a guess as to what infrastructure sizing would be
required for this dataset to run, and/or what changes I could make to the
configuration (attached) that might allow it to start?

Thanks