Posted to users@jena.apache.org by Andy Seaborne <an...@apache.org> on 2013/01/01 18:15:14 UTC
Re: querying TDB by program
On 01/01/13 16:20, Jean-Marc Vanel wrote:
> So, I created a new database with apache-jena 2.7.4 .
>
> 2012/12/31 Andy Seaborne <an...@apache.org>
>
>> On 31/12/12 12:01, Jean-Marc Vanel wrote:
>>
>>> And when I did that , I found that, apparently, TDB 0.9.4 inside EulerGUI
>>> is not able to read a directory made with latest TDB from SVN (the query
>>> result is empty).
>>>
>>
>> Do you have a test case for this?
>> (what's the query,
>
>
> SELECT DISTINCT ?p WHERE {?s ?p ?o} LIMIT 200
>
> what's the data?)
>
>
> I loaded this from dbPedia 3.8 :
> $HOME/src/apache-jena-2.7.4/bin/tdbloader2 --loc ../tdb74 labels_en.ttl
Not your fault but this is depressingly full of bad URIs. I wish they
would %-encode carefully to standardise the fixup of IRIs like the (much
better) .nt file.
(Yes - I know many tools ignore the issues but if the input is suspect,
how can we create reliable output on the web?)
>
> The code was :
> Dataset dataset = TDBFactory.createDataset(service.substring(5));
> Model db = dataset.getDefaultModel();
> queryExecution = QueryExecutionFactory.create(query, db );
Aside: This is not the best way because it does not go directly to the
TDB store.
>
> also tried :
> Dataset dataset = TDBFactory.createDataset(service.substring(5));
> queryExecution = QueryExecutionFactory.create(query, dataset );
Aside: This is better. It goes straight to TDB.
>
> then in both cases :
> final ResultSet result = queryExecution.execSelect();
>
> but : result.getRowNumber() == 0 ! :( .
See javadoc:
/** Return the "row" number for the current iterator item */
public int getRowNumber() ;
as you have not iterated over the results, you are at row 0.
getRowNumber is the *current* row, not the count.
Try:
int x = ResultSetFormatter.consume(result);
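To make the difference between "current row" and "row count" concrete, here is a minimal plain-Java sketch. RowCursor and consume below are hypothetical stand-ins written for illustration only; they are not Jena classes, and they only mimic the behaviour described above (position 0 before iterating, count known only after draining).

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class RowNumberDemo {

    /** Hypothetical stand-in for a query result cursor; not a Jena class. */
    static final class RowCursor<T> implements Iterator<T> {
        private final Iterator<T> rows;
        private int rowNumber = 0; // rows handed out so far

        RowCursor(Iterator<T> rows) { this.rows = rows; }

        @Override public boolean hasNext() { return rows.hasNext(); }

        @Override public T next() {
            T row = rows.next();
            rowNumber++;
            return row;
        }

        /** Position of the current row; 0 before the first call to next(). */
        int getRowNumber() { return rowNumber; }
    }

    /** Drain the cursor and return how many rows were read. */
    static <T> int consume(Iterator<T> cursor) {
        int count = 0;
        while (cursor.hasNext()) {
            cursor.next();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Made-up rows standing in for query solutions.
        List<String> data = Arrays.asList("row-a", "row-b", "row-c");
        RowCursor<String> cursor = new RowCursor<>(data.iterator());

        System.out.println("before iterating: " + cursor.getRowNumber());
        System.out.println("rows consumed:    " + consume(cursor));
        System.out.println("after iterating:  " + cursor.getRowNumber());
    }
}
```

In this sketch the position is 0 until the rows are read, and the total only becomes known once the cursor has been drained, at which point the cursor is exhausted.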
>
> However : db.size() == (long) 9442421
>
> And tdbquery gives :
>
> ------------------------------------------------
> | p |
> ================================================
> | <http://www.w3.org/2000/01/rdf-schema#label> |
> ------------------------------------------------
>
> This was tested with EulerGUI code, whose Maven dependencies are :
>
> <dependency>
> <groupId>org.apache.jena</groupId>
> <artifactId>jena-arq</artifactId>
> <version>2.9.4</version>
> </dependency>
>
> <dependency>
> <groupId>org.apache.jena</groupId>
> <artifactId>jena-tdb</artifactId>
> <version>0.9.4</version>
> </dependency>
>
> I understand that I should replace these with a single dependency , but I
> don't think this is the cause of my problem :
It's not.
>
> <dependency>
> <groupId>org.apache.jena</groupId>
> <artifactId>apache-jena</artifactId>
> <version>2.7.4</version>
<type>pom</type>
> </dependency>
There is no jar there (currently) - it's a POM of the modules.
>
>
>>
>> Andy
>>
>
>
>
Re: querying TDB by program
Posted by Jean-Marc Vanel <je...@gmail.com>.
2013/1/1 Andy Seaborne <an...@apache.org>
>
>> but : result.getRowNumber() == 0 ! :( .
>>
>
> See javadoc:
>
> /** Return the "row" number for the current iterator item */
> public int getRowNumber() ;
>
> as you have not iterated over the results, you are at row 0.
>
> getRowNumber is the *current* row, not the count.
>
OK. That was just an attempt to debug.
But when I then tried to make use of it, it seems that for ARQ queries
on remote SPARQL, result.getRowNumber() starts at 2 .
> Try:
>
> int x = ResultSetFormatter.consume(result)
>
This helped me to learn that a ResultSet can be iterated only once.
That was my problem, because I used it once to record the column datatypes
from the first row, and then again for the data rows.
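One common way around a single-pass result is to copy the rows into memory first and then read the copy as often as needed. The sketch below shows that idea in plain Java with made-up rows; buffer is a hypothetical helper, not part of Jena. (ARQ itself offers ResultSetFactory.makeRewindable(result), which returns a ResultSetRewindable that can be reset(); that is probably the more natural choice in real Jena code, at the cost of holding all rows in memory.)

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class TwoPassDemo {

    /** Copy every remaining row of a single-pass iterator into a list. */
    static <T> List<T> buffer(Iterator<T> onePass) {
        List<T> rows = new ArrayList<>();
        while (onePass.hasNext()) {
            rows.add(onePass.next());
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up rows standing in for query solutions.
        Iterator<List<String>> onePass = Arrays.asList(
                Arrays.asList("s1", "label one"),
                Arrays.asList("s2", "label two")).iterator();

        List<List<String>> rows = buffer(onePass);

        // Pass 1: look at the first row only, e.g. to record column info.
        if (!rows.isEmpty()) {
            System.out.println("columns in first row: " + rows.get(0).size());
        }

        // Pass 2: walk all rows, including the first one again.
        for (List<String> row : rows) {
            System.out.println(row);
        }
    }
}
```

After buffer returns, the original iterator is exhausted, but the list can be traversed any number of times.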
Thanks for the answers.
--
Jean-Marc Vanel
Déductions SARL - Consulting, services, training,
Rule-based programming, Semantic Web
http://deductions-software.com/
+33 (0)6 89 16 29 52
Twitter: @jmvanel ; chat: irc://irc.freenode.net#eulergui