Posted to users@jena.apache.org by Rafiq <ak...@univ-lyon1.fr> on 2013/12/16 11:38:00 UTC
Looking for suggestions if you have
To Andy,
I need suggestions on two issues:
(i) I ran the TDB loader to load 159.(something) million triples, and it
took 357636.22 seconds to complete (!!!).
To me, that is an enormous amount of time, because I
have 1.3 billion triples to load. In addition to
processing time, loading time is critical for me. I chose TDB
because I was impressed by the benchmark you (Andy)
published at http://www.w3.org/wiki/LargeTripleStores. I am sure
there is a way out that I don't know of but you do.
I would be really grateful if you could help me.
(*Machine* info: I am using a 64-bit machine with 8 GB of RAM and an Intel(R)
Core(TM) i5-2400 CPU @ 3.10GHz. It's a quad-core machine
running Linux.)
(ii) I am using the LUBM dataset and running the LUBM queries. Unfortunately,
when I launch a query, the system returns a
*400 error*, which is essentially a bad request error. As I
understand it, the server finds a syntax error,
but I have no idea why it happens. I am using the Fuseki
SPARQL interface.
Thanking you in advance for your suggestions.
Regards,
Rafiq
Re: Looking for suggestions if you have
Posted by Andy Seaborne <an...@apache.org>.
On 16/12/13 10:38, Rafiq wrote:
> To Andy,
>
> I need suggestions on two issues:
>
> (i) I ran the TDB loader to load 159.(something) million triples, and it
> took 357636.22 seconds to complete (!!!).
> To me, that is an enormous amount of time, because I
> have 1.3 billion triples to load. In addition to
> processing time, loading time is critical for me. I chose TDB
> because I was impressed by the benchmark you (Andy)
> published at http://www.w3.org/wiki/LargeTripleStores. I am sure
> there is a way out that I don't know of but you do.
> I would be really grateful if you could help me.
>
> (*Machine* info: I am using a 64-bit machine with 8 GB of RAM and an Intel(R)
> Core(TM) i5-2400 CPU @ 3.10GHz. It's a quad-core machine
> running Linux.)
It's probably because of the 8G. If you can borrow a larger machine
just for the load, then it should go faster. I recently loaded 410e6 triples
overnight on a 30G machine so, subject to what your data profile is,
16G-24G would be better. And use a small heap - 2G?
It does sound like you've pushed over the edge. Because disk is
involved, if it starts hitting disk too much, performance drops sharply.
tdbloader2 *may* help but it's not certain to.
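
For scale, here is a rough back-of-the-envelope calculation using only the figures quoted above. It assumes the load rate stays constant and rounds "159.(something) million" down to 159 million. Both are simplifications, since the rate tends to fall further once the machine starts hitting disk.

```python
# Figures quoted in the original message.
triples_loaded = 159e6      # "159.(something) million", rounded down (assumption)
seconds_taken = 357636.22   # reported load time
total_triples = 1.3e9       # size of the full dataset

SECONDS_PER_DAY = 86400

# Average load rate observed so far, in triples per second.
rate = triples_loaded / seconds_taken

# Projected time for the full dataset if that rate held (it may well not).
projected_seconds = total_triples / rate
elapsed_days = seconds_taken / SECONDS_PER_DAY
projected_days = projected_seconds / SECONDS_PER_DAY

print(f"observed rate: {rate:.0f} triples/s")
print(f"elapsed so far: {elapsed_days:.1f} days")
print(f"projected for 1.3e9 triples: {projected_days:.1f} days")
```

A rate of a few hundred triples per second over about four days is consistent with a load that has outgrown the available memory.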
> (ii) I am using the LUBM dataset and running the LUBM queries. Unfortunately,
> when I launch a query, the system returns a
> *400 error*, which is essentially a bad request error. As I
> understand it, the server finds a syntax error,
> but I have no idea why it happens. I am using the Fuseki
> SPARQL interface.
Try using the online query validator as it prints the parse error. The
parse error is also in the response body for the 400.
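
One way to see that response body is to send the query from a small script and print whatever the server returns. This is a minimal sketch, not the only way to do it: the endpoint URL (host, port and dataset name `ds`) is a placeholder that must be adjusted to the actual Fuseki setup, and the helper names `build_request` and `run_query` are made up for illustration. It uses a standard SPARQL Protocol POST with a form-encoded `query` parameter.

```python
import urllib.error
import urllib.parse
import urllib.request

# Hypothetical endpoint: adjust host, port and dataset name to your setup.
ENDPOINT = "http://localhost:3030/ds/query"


def build_request(endpoint, query):
    """Build a SPARQL Protocol POST request with a form-encoded 'query' field."""
    data = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=data,  # supplying data makes urllib send a POST
        headers={"Accept": "application/sparql-results+json"},
    )


def run_query(endpoint, query):
    """Return (status, body). For HTTP errors such as 400, the body is the
    server's error message instead of query results."""
    try:
        with urllib.request.urlopen(build_request(endpoint, query)) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", errors="replace")


# Example use, with a server running at ENDPOINT:
#   status, body = run_query(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 5")
#   print(status)
#   print(body)
```

If the status is 400, the printed body should point at where in the query the parser gave up.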
>
> Thanking you in advance for your suggestions.
>
> Regards,
> Rafiq
Andy