Posted to users@jena.apache.org by Zen 98052 <z9...@outlook.com> on 2016/01/04 20:12:32 UTC

long update query string causing stack overflow

Hi,

I have a big INSERT DATA query with about 20K triples.

I passed the query string to UpdateFactory.create(), and it threw an exception.


at com.hp.hpl.jena.sparql.lang.ParserSPARQL11Update._parse(ParserSPARQL11Update.java:80)
at com.hp.hpl.jena.sparql.lang.ParserSPARQL11Update.parse$(ParserSPARQL11Update.java:41)
at com.hp.hpl.jena.sparql.lang.UpdateParser.parse(UpdateParser.java:39)
at com.hp.hpl.jena.update.UpdateFactory.make(UpdateFactory.java:88)
at com.hp.hpl.jena.update.UpdateFactory.create(UpdateFactory.java:79)
at com.hp.hpl.jena.update.UpdateFactory.create(UpdateFactory.java:57)
at com.hp.hpl.jena.update.UpdateFactory.create(UpdateFactory.java:47)
...
Caused by: java.lang.StackOverflowError
at com.hp.hpl.jena.sparql.lang.sparql_11.SPARQLParser11TokenManager.jjMoveNfa_0(SPARQLParser11TokenManager.java:2216)
at com.hp.hpl.jena.sparql.lang.sparql_11.SPARQLParser11TokenManager.jjMoveStringLiteralDfa2_0(SPARQLParser11TokenManager.java:421)
at com.hp.hpl.jena.sparql.lang.sparql_11.SPARQLParser11TokenManager.jjMoveStringLiteralDfa1_0(SPARQLParser11TokenManager.java:341)
at com.hp.hpl.jena.sparql.lang.sparql_11.SPARQLParser11TokenManager.jjMoveStringLiteralDfa0_0(SPARQLParser11TokenManager.java:151)
at com.hp.hpl.jena.sparql.lang.sparql_11.SPARQLParser11TokenManager.getNextToken(SPARQLParser11TokenManager.java:3753)
at com.hp.hpl.jena.sparql.lang.sparql_11.SPARQLParser11.jj_ntk(SPARQLParser11.java:5026)
at com.hp.hpl.jena.sparql.lang.sparql_11.SPARQLParser11.Verb(SPARQLParser11.java:2535)
at com.hp.hpl.jena.sparql.lang.sparql_11.SPARQLParser11.PropertyListNotEmpty(SPARQLParser11.java:2503)
at com.hp.hpl.jena.sparql.lang.sparql_11.SPARQLParser11.TriplesSameSubject(SPARQLParser11.java:2469)
at com.hp.hpl.jena.sparql.lang.sparql_11.SPARQLParser11.TriplesTemplate(SPARQLParser11.java:1619)

Is there a workaround for this, besides breaking down the query (tried with 5K triples and it works fine)?


Thanks,

Z


Re: long update query string causing stack overflow

Posted by Zen 98052 <z9...@outlook.com>.
Thanks all for the replies.

@Andy: Yes, it's an old version (still with the com.hp.hpl.* namespace). I changed the code to:
UpdateAction.parseExecute(null, graphDataset, stream, Syntax.syntaxARQ);
and it works fine, so I am good for now. Thanks a lot!

@Rob: UpdateAction.parseExecute also gave the same exception unless I passed the syntax as Andy mentioned. I can't copy/paste the query to Pastebin since I can't access the site from the office, but I simply created the query by copying the data from a .nt file and enclosing it within 'INSERT DATA { }'.

@Lorenz: I haven't tried changing the JVM option, since I already have the solution.
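
For anyone who still needs the "breaking down the query" workaround from the original question, here is a minimal sketch in plain Java of cutting N-Triples lines into several smaller 'INSERT DATA { }' strings. The class name InsertDataBatcher, the method name and the batch size are made up for illustration and are not part of Jena. One caveat: blank node labels are scoped to a single update request, so data that shares blank nodes across batches would likely end up with distinct blank nodes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not a Jena class): wraps N-Triples lines in INSERT DATA blocks. */
final class InsertDataBatcher {

    /** Returns one INSERT DATA string per batch of at most batchSize triples. */
    static List<String> toInsertDataBatches(List<String> ntLines, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> updates = new ArrayList<>();
        StringBuilder body = new StringBuilder();
        int inBatch = 0;
        for (String line : ntLines) {
            String triple = line.trim();
            // Skip blank lines and N-Triples comments.
            if (triple.isEmpty() || triple.startsWith("#")) {
                continue;
            }
            body.append(triple).append('\n');
            if (++inBatch == batchSize) {
                updates.add("INSERT DATA {\n" + body + "}");
                body.setLength(0);
                inBatch = 0;
            }
        }
        // Flush the final, possibly smaller, batch.
        if (inBatch > 0) {
            updates.add("INSERT DATA {\n" + body + "}");
        }
        return updates;
    }
}
```

Each returned string could then be parsed and executed on its own, for example with a batch size of 5000, which is the size reported to work above.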


Re: long update query string causing stack overflow

Posted by Andy Seaborne <an...@apache.org>.
On 05/01/16 09:50, Rob Vesse wrote:
> What version of Jena is this?
>
> Trying to parse large updates into memory always risks hitting memory
> issues, though the fact that you get a StackOverflowError seems a little
> odd.  Can you share the update via a Gist/Pastebin/etc?
>
> Depending on how you want to evaluate updates, ARQ does support processing
> updates in a pure streaming fashion; this is how Fuseki can accept
> arbitrarily large updates.  To do this you need to use one of the variants
> of UpdateAction.parseExecute() - please try that and see if it resolves
> the issue.
>
> Note that this only works if you are updating data exposed via the
> DatasetGraph/Graph/Model interfaces, so it will not be of any use if you
> are trying to update a remote store via HTTP.
>
> Rob

Presumably this stacktrace is not from inside Fuseki.


Try parsing with "Syntax.syntaxARQ".
Or globally set "Syntax.defaultUpdateSyntax" to that value.


The "ARQ" language is a superset of SPARQL and it also includes grammar 
improvements for some SPARQL forms including DATA.

When using strict SPARQL, the parser follows the grammar exactly as
it is written in the spec.  The spec grammar is simple to parse (it's an
LL(1) grammar) but it is recursive on TriplesTemplate, so the parser's
call stack grows with the number of triples in the block.

When using the ARQ form, the recursive points are rewritten as loops: the
same Abstract Syntax Tree output, a different way to get there, and it
uses a local lookahead of 2. [*]

I tested (Jena3; you have some kind of Jena2) and parsed 25K triples in 
INSERT DATA and it works for ARQ where it does not work for SPARQL. 
IIRC that grammar rewrite is quite old.  It may be in your version.

Fuseki accepts "ARQ".  A stack overflow shouldn't happen there.

	Andy

[*]

void TriplesTemplate(TripleCollector acc) : { }
{    // same as ConstructTriples
#if SPARQL_11
     // Version for the spec.
     TriplesSameSubject(acc)
     (<DOT> (TriplesTemplate(acc))?)?
#endif
#ifdef ARQ
     // Rewrite for no recursion - grammar is not LL(1)
     TriplesSameSubject(acc)
     (LOOKAHEAD(2) (<DOT>) TriplesSameSubject(acc))*
     (<DOT>)?
#endif
}
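
To make the difference between the two grammar forms above concrete, here is a toy recognizer in plain Java. It is only a sketch and not Jena's generated parser: the class TriplesTemplateSketch and its methods are hypothetical, and each "triple" is reduced to a single token with "." as the separator. parseRecursive mirrors the spec-style rule (one nested call per triple) and parseIterative mirrors the ARQ-style rule (a loop that peeks two tokens ahead).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the two TriplesTemplate rules; a "triple" is any token other than ".". */
final class TriplesTemplateSketch {
    private final List<String> tokens;
    private int pos = 0;

    TriplesTemplateSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    private boolean atDot(int offset) {
        int i = pos + offset;
        return i < tokens.size() && tokens.get(i).equals(".");
    }

    private boolean atTriple(int offset) {
        int i = pos + offset;
        return i < tokens.size() && !tokens.get(i).equals(".");
    }

    /** Spec-style rule: T (DOT (TriplesTemplate)?)? ; recursion depth equals the triple count. */
    int parseRecursive() {
        if (!atTriple(0)) throw new IllegalStateException("triple expected at token " + pos);
        pos++;                                  // TriplesSameSubject
        int count = 1;
        if (atDot(0)) {
            pos++;                              // <DOT>
            if (atTriple(0)) {
                count += parseRecursive();      // nested TriplesTemplate
            }
        }
        return count;
    }

    /** ARQ-style rule: T (LOOKAHEAD(2) DOT T)* (DOT)? ; constant stack depth. */
    int parseIterative() {
        if (!atTriple(0)) throw new IllegalStateException("triple expected at token " + pos);
        pos++;                                  // TriplesSameSubject
        int count = 1;
        while (atDot(0) && atTriple(1)) {       // two-token lookahead: DOT then another triple
            pos += 2;
            count++;
        }
        if (atDot(0)) pos++;                    // optional trailing <DOT>
        return count;
    }

    /** Builds "t . t . ... t ." with the given number of triples. */
    static List<String> tokens(int triples) {
        List<String> out = new ArrayList<>(2 * triples);
        for (int i = 0; i < triples; i++) {
            out.add("t");
            out.add(".");
        }
        return out;
    }

    public static void main(String[] args) {
        int n = 500_000;
        System.out.println("loop form parsed " + new TriplesTemplateSketch(tokens(n)).parseIterative());
        try {
            System.out.println("recursive form parsed " + new TriplesTemplateSketch(tokens(n)).parseRecursive());
        } catch (StackOverflowError e) {
            System.out.println("recursive form hit StackOverflowError");
        }
    }
}
```

Both methods accept the same input and return the same count; only the stack usage differs. Whether the recursive form actually overflows for a given size depends on the JVM's thread stack size.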


Re: long update query string causing stack overflow

Posted by Rob Vesse <rv...@dotnetrdf.org>.
What version of Jena is this?

Trying to parse large updates into memory always risks hitting memory
issues, though the fact that you get a StackOverflowError seems a little
odd.  Can you share the update via a Gist/Pastebin/etc?

Depending on how you want to evaluate updates, ARQ does support processing
updates in a pure streaming fashion; this is how Fuseki can accept
arbitrarily large updates.  To do this you need to use one of the variants
of UpdateAction.parseExecute() - please try that and see if it resolves
the issue.

Note that this only works if you are updating data exposed via the
DatasetGraph/Graph/Model interfaces, so it will not be of any use if you
are trying to update a remote store via HTTP.

Rob


Re: long update query string causing stack overflow

Posted by buehmann <bu...@informatik.uni-leipzig.de>.
I think this is JVM-specific: if you have too many recursive calls, the
thread stack is too small. Maybe you can have a look at the JVM option -Xss
and play around with it before splitting the query.

Lorenz
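
Besides the global -Xss option, a larger stack can also be requested for a single thread. The following is a sketch under assumptions: BigStackRunner and depth() are hypothetical names, depth() only stands in for a deeply recursive parse, and the stack size passed to the Thread constructor is a hint that some platforms may ignore. In the situation from this thread, the task would be the call that parses the update string.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical helper: runs a task on a thread that requests a larger stack. */
final class BigStackRunner {

    /** Runs the task on a new thread asking for stackBytes of stack and rethrows its failure. */
    static void runWithStack(long stackBytes, Runnable task) {
        AtomicReference<Throwable> failure = new AtomicReference<>();
        Thread worker = new Thread(null, () -> {
            try {
                task.run();
            } catch (Throwable t) {
                failure.set(t);                 // includes StackOverflowError
            }
        }, "big-stack-worker", stackBytes);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        Throwable t = failure.get();
        if (t instanceof Error) throw (Error) t;
        if (t instanceof RuntimeException) throw (RuntimeException) t;
    }

    /** Stand-in for a recursive parse: uses one stack frame per unit of n. */
    static int depth(int n) {
        return n == 0 ? 0 : 1 + depth(n - 1);
    }

    public static void main(String[] args) {
        // Request 128 MB of stack for this one task only.
        runWithStack(128L * 1024 * 1024, () -> System.out.println(depth(100_000)));
    }
}
```

This keeps the larger stack local to the parsing work instead of raising it for every thread in the JVM, but it only moves the limit; it does not remove the recursion.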
