Posted to users@jena.apache.org by Qiaser Mehmood <qm...@yahoo.com> on 2014/05/24 21:17:52 UTC

Uncontrollable Loop with warning (WARN riot:77 - [line: 4, col: 41] {W108} Not an XML Name: ' ')

Hey,
I ran into a strange situation while querying an endpoint with Jena's execConstruct() or execConstructTriples().

The issue is an uncontrollable loop of warnings (WARN  riot:77 - [line: 4, col: 41] {W108} Not an XML Name: ' ') when querying the endpoint http://roma.rkbexplorer.com/sparql/. The query I run is:

PREFIX void: <http://rdfs.org/ns/void#>
CONSTRUCT { <http://roma.rkbexplorer.com/sparql#dataset> void:classPartition [ void:class ?c ; void:propertyPartition [ void:property ?p ] ] }
WHERE { ?s a ?c ; ?p ?o .}

The HTTP request generated by Jena's QueryExecutionFactory.sparqlService(endpoint, query) is as follows:
http://roma.rkbexplorer.com/sparql/?query=PREFIX++void%3A+%3Chttp%3A%2F%2Frdfs.org%2Fns%2Fvoid%23%3E%0A%0ACONSTRUCT+%0A++%7B+%3Chttp%3A%2F%2Froma.rkbexplorer.com%2Fsparql%2F%23dataset%3E+void%3AclassPartition+_%3Ac0+.%0A++++_%3Ac0+void%3Aclass+%3Fc+.%0A++++_%3Ac0+void%3ApropertyPartition+_%3Ac1+.%0A++++_%3Ac1+void%3Aproperty+%3Fp+.%7D%0AWHERE%0A++%7B+%3Fs+%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23type%3E+%3Fc+.%0A++++%3Fs+%3Fp+%3Fo%0A++%7D%0A
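
The query parameter in that URL is ordinary form encoding, so it can be decoded with the JDK alone to see what was actually sent. A minimal sketch; the class name DecodeQueryParam is made up for illustration, and the input is one fragment copied from the URL above:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class DecodeQueryParam {
    /** Decodes a form-encoded query parameter ('+' is a space, %XX is a byte). */
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException("UTF-8 not available", e);
        }
    }

    public static void main(String[] args) {
        // First triple pattern of the CONSTRUCT template, copied from the URL above.
        String fragment = "%3Chttp%3A%2F%2Froma.rkbexplorer.com%2Fsparql%2F%23dataset%3E"
                + "+void%3AclassPartition+_%3Ac0+.";
        // prints: <http://roma.rkbexplorer.com/sparql/#dataset> void:classPartition _:c0 .
        System.out.println(decode(fragment));
    }
}
```

Decoded this way, the bracketed blank nodes of the query show up as the labelled blank nodes _:c0 and _:c1 in the request.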

Could you please fix this in the Jena library? The loop starts inside Jena as soon as execConstruct or execConstructTriples is called, so I am unable to handle it in my Java code.

Thanks.
Qaiser.

Re: Uncontrollable Loop with warning (WARN riot:77 - [line: 4, col: 41] {W108} Not an XML Name: ' ')

Posted by Andy Seaborne <an...@apache.org>.
On 25/05/14 10:17, Andy Seaborne wrote:
> That is because Jena reads, and parses the whole reply from the far end
> at the execConstructTriples point.

In the implementation, it forks off a thread:

QueryEngineHTTP.execConstructTriples
->
QueryEngineHTTP.execTriples
->
RiotReader.createIteratorTriples




Re: Uncontrollable Loop with warning (WARN riot:77 - [line: 4, col: 41] {W108} Not an XML Name: ' ')

Posted by Andy Seaborne <an...@apache.org>.
On 24/05/14 22:53, Qiaser Mehmood wrote:
> Thank you Andy, here is sample code:
>
>   if (query.isConstructType()) {
>       /* Model mdl = qe.execConstruct();
>          StmtIterator stmt = mdl.listStatements();
>          while (stmt.hasNext()) {
>              stmt.next();
>          } */
>
>       Iterator<Triple> triples = qe.execConstructTriples();
>       while (triples.hasNext()) {
>           triples.next();
>       }
>   }
>
> in both cases the construct functions start looping, sending the continuous

It's not continuous - it does stop, there are just a lot of warnings. (It
so happens I'm in the same country as Southampton and get good bandwidth
to that server.)

> warning and never reach the while loop where I could handle it. So what should be done to handle {W108} Not an XML Name: ' '?

That is because Jena reads, and parses the whole reply from the far end 
at the execConstructTriples point.

It's a warning, turn it off or ignore it.
  (turn off log4j.logger.org.apache.jena.riot, e.g. by setting it to ERROR or OFF in log4j.properties)

But the results look like they are junk anyway.  Ask the query with wget 
or curl to get the RDF/XML and edit it to be correct or ask a different 
query.  Or talk to the people who run that service (which is running 
3Store).
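
For anyone who prefers to stay in Java rather than use wget or curl, the raw response can be saved to a file without parsing it. A minimal sketch using only the JDK; the class RawConstructFetch, the output file name result.rdf and the small LIMIT query are assumptions for illustration, and whether the endpoint honours LIMIT on CONSTRUCT is not something this thread confirms.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

final class RawConstructFetch {
    /** Builds the GET URL for a SPARQL query request (query passed as ?query=...). */
    static String requestUrl(String endpoint, String query) {
        try {
            return endpoint + "?query=" + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException("UTF-8 not available", e);
        }
    }

    /** Saves the raw response body to a file; nothing is parsed here. */
    static void fetchTo(String url, String accept, Path target) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestProperty("Accept", accept);
        try (InputStream in = conn.getInputStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical, deliberately small query so the file stays inspectable.
        String query = "CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o } LIMIT 10";
        String url = requestUrl("http://roma.rkbexplorer.com/sparql/", query);
        fetchTo(url, "application/rdf+xml", Paths.get("result.rdf"));
    }
}
```

The saved file can then be opened in an editor to look at the rdf:nodeID attributes directly.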

	Andy





Re: Uncontrollable Loop with warning (WARN riot:77 - [line: 4, col: 41] {W108} Not an XML Name: ' ')

Posted by Qiaser Mehmood <qm...@yahoo.com>.
Thank you Andy, here is sample code:

 if (query.isConstructType()) {
     /* Alternative: build the whole result as a Model first.
     Model mdl = qe.execConstruct();
     StmtIterator stmt = mdl.listStatements();
     while (stmt.hasNext()) {
         stmt.next();
     }
     */

     // Stream the CONSTRUCT result triple by triple.
     Iterator<Triple> triples = qe.execConstructTriples();
     while (triples.hasNext()) {
         triples.next();
     }
 }
             
In both cases the construct functions start looping, emitting the warning continuously, and never reach the while loop where I could handle it. So what should be done to handle {W108} Not an XML Name: ' '?

Thanks.
Qaiser.


On Saturday, May 24, 2014 9:43 PM, Andy Seaborne <an...@apache.org> wrote:
 


On 24/05/14 20:17, Qiaser Mehmood wrote:
> Hey,
> I got a strange situation while querying an endpoint by running jena execConstruct (),  OR  execConstructTriples().
>
> The issue is uncontrollable loop

Your query generates a large result - the whole database scaled up by 
the number of classes.

You don't show your code but I guess you are using the default results 
format, which is RDF/XML, which is slow to parse.

The total result from the remote endpoint is 91M: 2,226,435 lines of RDF/XML
for only 742,144 triples.

> (WARN  riot:77 - [line: 4, col: 41] {W108} Not an XML Name: ' ')

The results returned by the endpoint are not correct (rdf:nodeID="" by
the look of it).  Luckily, it's a WARNing, not an ERROR.  Unluckily,
it's a sign of a problem in the results.

All the bnodes are the same bnode because they have the same rdf:nodeID.


> by querying an endpoint http://roma.rkbexplorer.com/sparql/  and the actual query i use to run is
>
> PREFIX void: <http://rdfs.org/ns/void#>
> CONSTRUCT { <http://roma.rkbexplorer.com/sparql#dataset> void:classPartition [ void:class ?c ; void:propertyPartition [ void:property ?p ] ] }
> WHERE { ?s a ?c ; ?p ?o .}

which is the whole database multiplied by the number of classes. 
Expensive and large.

It looks like the remote endpoint does not do full duplicate suppression
in graph results (it isn't required to) in the interests of streaming.

The graph returned is just 157 distinct triples.

>
> the http query generated by jena QueryExecutionFactory.sparqlService(endpoint, query) is as follow:
> http://roma.rkbexplorer.com/sparql/?query=PREFIX++void%3A+%3Chttp%3A%2F%2Frdfs.org%2Fns%2Fvoid%23%3E%0A%0ACONSTRUCT+%0A++%7B+%3Chttp%3A%2F%2Froma.rkbexplorer.com%2Fsparql%2F%23dataset%3E+void%3AclassPartition+_%3Ac0+.%0A++++_%3Ac0+void%3Aclass+%3Fc+.%0A++++_%3Ac0+void%3ApropertyPartition+_%3Ac1+.%0A++++_%3Ac1+void%3Aproperty+%3Fp+.%7D%0AWHERE%0A++%7B+%3Fs+%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23type%3E+%3Fc+.%0A++++%3Fs+%3Fp+%3Fo%0A++%7D%0A
>
> would you please solve this issue in jena library, since the loop starts within jena library as execConstruct or execConstructTriples  function is called. I am unable to handle it in my java code since looping start internally.

Not a problem in Jena, which is just passing on your query to the remote 
end and parsing the results.

    Andy


>
> Thanks.
> Qaiser.
>

Re: Uncontrollable Loop with warning (WARN riot:77 - [line: 4, col: 41] {W108} Not an XML Name: ' ')

Posted by Andy Seaborne <an...@apache.org>.
On 24/05/14 20:17, Qiaser Mehmood wrote:
> Hey,
> I got a strange situation while querying an endpoint by running jena execConstruct (),  OR  execConstructTriples().
>
> The issue is uncontrollable loop

Your query generates a large result - the whole database scaled up by 
the number of classes.

You don't show your code but I guess you are using the default results 
format, which is RDF/XML, which is slow to parse.

The total result from the remote endpoint is 91M: 2,226,435 lines of RDF/XML
for only 742,144 triples.

> (WARN  riot:77 - [line: 4, col: 41] {W108} Not an XML Name: ' ')

The results returned by the endpoint are not correct (rdf:nodeID="" by
the look of it).  Luckily, it's a WARNing, not an ERROR.  Unluckily,
it's a sign of a problem in the results.

All the bnodes are the same bnode because they have the same rdf:nodeID.
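
A minimal sketch of why identical labels collapse, assuming a parser that keeps one label-to-node table per document (rdf:nodeID labels are scoped to the document in RDF/XML). The class BnodeLabelSketch and the class name ex:A are hypothetical; this is not Jena's parser.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class BnodeLabelSketch {
    // One table per document: the same label always yields the same node.
    private final Map<String, Integer> labelToNode = new HashMap<String, Integer>();

    int nodeFor(String label) {
        Integer node = labelToNode.get(label);
        if (node == null) {
            node = labelToNode.size();
            labelToNode.put(label, node);
        }
        return node;
    }

    /** Each entry is {bnode label, class}; returns the number of distinct triples. */
    static int distinctTriples(List<String[]> labelledTriples) {
        BnodeLabelSketch doc = new BnodeLabelSketch();
        Set<String> distinct = new LinkedHashSet<String>();
        for (String[] t : labelledTriples) {
            distinct.add("_:b" + doc.nodeFor(t[0]) + " void:class " + t[1]);
        }
        return distinct.size();
    }

    public static void main(String[] args) {
        List<String[]> good = Arrays.asList(
                new String[] {"c0", "ex:A"}, new String[] {"c1", "ex:A"});
        List<String[]> broken = Arrays.asList(
                new String[] {"", "ex:A"}, new String[] {"", "ex:A"});
        System.out.println(distinctTriples(good));   // prints 2
        System.out.println(distinctTriples(broken)); // prints 1
    }
}
```

With distinct labels the two partitions stay separate; with the same (empty) label they become one node and the triples become duplicates.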


> by querying an endpoint http://roma.rkbexplorer.com/sparql/  and the actual query i use to run is
>
> PREFIX void: <http://rdfs.org/ns/void#>
> CONSTRUCT { <http://roma.rkbexplorer.com/sparql#dataset> void:classPartition [ void:class ?c ; void:propertyPartition [ void:property ?p ] ] }
> WHERE { ?s a ?c ; ?p ?o .}

which is the whole database multiplied by the number of classes. 
Expensive and large.

It looks like the remote endpoint does not do full duplicate suppression
in graph results (it isn't required to) in the interests of streaming.

The graph returned is just 157 distinct triples.
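
Putting the figures quoted in this reply side by side gives a feel for how much of the transfer is redundant. A small arithmetic sketch; the class name ResultSizeSketch is made up, and the three constants are the numbers given above:

```java
final class ResultSizeSketch {
    static final long LINES = 2226435L;   // lines of RDF/XML received
    static final long TRIPLES = 742144L;  // triples parsed from them
    static final long DISTINCT = 157L;    // distinct triples in the final graph

    /** Whole lines of RDF/XML per parsed triple. */
    static long linesPerTriple() {
        return LINES / TRIPLES;
    }

    /** Whole number of copies received per distinct triple. */
    static long copiesPerDistinctTriple() {
        return TRIPLES / DISTINCT;
    }

    public static void main(String[] args) {
        System.out.println(linesPerTriple());          // prints 3
        System.out.println(LINES - 3 * TRIPLES);       // prints 3 (lines left over)
        System.out.println(copiesPerDistinctTriple()); // prints 4727
    }
}
```

So each triple costs about three lines of RDF/XML, and each distinct triple arrives several thousand times over.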

>
> the http query generated by jena QueryExecutionFactory.sparqlService(endpoint, query) is as follow:
> http://roma.rkbexplorer.com/sparql/?query=PREFIX++void%3A+%3Chttp%3A%2F%2Frdfs.org%2Fns%2Fvoid%23%3E%0A%0ACONSTRUCT+%0A++%7B+%3Chttp%3A%2F%2Froma.rkbexplorer.com%2Fsparql%2F%23dataset%3E+void%3AclassPartition+_%3Ac0+.%0A++++_%3Ac0+void%3Aclass+%3Fc+.%0A++++_%3Ac0+void%3ApropertyPartition+_%3Ac1+.%0A++++_%3Ac1+void%3Aproperty+%3Fp+.%7D%0AWHERE%0A++%7B+%3Fs+%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23type%3E+%3Fc+.%0A++++%3Fs+%3Fp+%3Fo%0A++%7D%0A
>
> would you please solve this issue in jena library, since the loop starts within jena library as execConstruct or execConstructTriples  function is called. I am unable to handle it in my java code since looping start internally.

Not a problem in Jena, which is just passing on your query to the remote 
end and parsing the results.

	Andy

>
> Thanks.
> Qaiser.
>