Posted to users@jena.apache.org by Peter Jungen <el...@gmx.de> on 2011/11/09 16:16:21 UTC

verify load successful

Hello dear jena team,
I have loaded a dump (nquads format), numerous files, into my TDB using
tdbloader2.
How do I verify the loading of the dump was indeed completely successful?
regards
Pete

Re: verify load successful

Posted by Andy Seaborne <an...@apache.org>.
On 10/11/11 09:47, Damian Steer wrote:
>
> On 10 Nov 2011, at 09:34, Andy Seaborne wrote:
>
>> After loading I do
>>
>> SELECT (count(*) as ?c)
>> { { ?s ?p ?o } UNION { GRAPH ?s { ?s ?p ?o } } }
>
>
> Typo:
>
> ... GRAPH ?g { ?s ...
>
> Damian

And there was I counting the {}!

	Andy


Re: verify load successful

Posted by Damian Steer <d....@bristol.ac.uk>.
On 10 Nov 2011, at 09:34, Andy Seaborne wrote:

> After loading I do
> 
> SELECT (count(*) as ?c)
> { { ?s ?p ?o } UNION { GRAPH ?s { ?s ?p ?o } } }


Typo:

... GRAPH ?g { ?s ...

Damian
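
For reference, here is a minimal sketch of the count check with the graph variable corrected to ?g, wrapped so it can be scripted. It assumes Jena's tdbquery script is on the PATH and is called as "tdbquery --loc=DIR QUERY"; the function names are hypothetical and only for illustration.

```python
import subprocess

# Count query from the thread, with the GRAPH variable corrected to ?g.
COUNT_QUERY = (
    "SELECT (count(*) AS ?c)\n"
    "{ { ?s ?p ?o } UNION { GRAPH ?g { ?s ?p ?o } } }"
)


def build_count_command(db_dir):
    """Return the argument list that runs the count query on a TDB directory.

    Assumption: the `tdbquery` command is available and accepts --loc=DIR
    followed by the query string.
    """
    return ["tdbquery", "--loc=" + db_dir, COUNT_QUERY]


def run_count(db_dir):
    """Run the count query; raises CalledProcessError if tdbquery fails."""
    subprocess.run(build_count_command(db_dir), check=True)
```

Calling run_count with the database directory prints whatever tdbquery reports; the count still has to be compared by hand against what the dump should contain.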

Re: verify load successful

Posted by Andy Seaborne <an...@apache.org>.
Firstly, I run "riot --validate" on the data before loading.

After loading I do

SELECT (count(*) as ?c)
{ { ?s ?p ?o } UNION { GRAPH ?s { ?s ?p ?o } } }

The key is that it goes to the end of the SPO index (hence checking that
the end of the load happened), and it is reasonably quick.

Note that this is the triple count, so if you have duplicates in the data
it will be less than "wc -l" of the nquads.
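
To get a number that is comparable with the query result, the input can be counted with duplicates removed. The following is a rough sketch, not part of Jena: the function names are hypothetical, and duplicates are detected by comparing line text, so two different spellings of the same quad would still count twice.

```python
def count_statements(lines):
    """Return (total, distinct) statement counts for N-Quads lines.

    Blank lines and comment lines (starting with '#') are skipped.
    Duplicates are found by exact text match after stripping whitespace
    at both ends, which is only an approximation of RDF equality.
    """
    total = 0
    seen = set()
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        total += 1
        seen.add(stripped)
    return total, len(seen)


def count_files(paths):
    """Apply count_statements across several N-Quads files."""
    def all_lines():
        for path in paths:
            with open(path, encoding="utf-8") as handle:
                yield from handle
    return count_statements(all_lines())
```

The distinct figure is the one to compare with the count(*) result. The set holds every distinct line in memory, so for very large dumps an external sort is likely to scale better.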

I also run

SELECT * { <uri> ?p ?o }

for a <uri> known to be a subject.

	Andy

On 09/11/11 23:59, Paolo Castagna wrote:
> Peter Jungen wrote:
>> Hello dear jena team,
>> I have loaded a dump (nquads format), numerous files, into my TDB using
>> tdbloader2.
>> How do I verify the loading of the dump was indeed completely successful?
>> regards
>> Pete
>
> Hi Pete,
> I usually use tdbdump. See tdbdump --help:
>
> tdbdump : Write N-Quads to stdout
>    Location
>        --loc=DIR              Location (a directory)
>        --tdb=                 Assembler description file
>    Symbol definition
>        --set                  Set a configuration symbol to a value
>        --strict               Operate in strict SPARQL mode (no extensions of any kind)
>        --desc=                Assembler description file
>    General
>        -v   --verbose         Verbose
>        -q   --quiet           Run with minimal output
>        --debug                Output information for debugging
>        --help
>        --version              Version information
>
> You can also load the data with tdbloader, dump that out, sort it and diff
> it against what you get from tdbloader2 (there should be no differences
> apart from blank nodes).
>
> A small TDB "verifier" is here:
> https://github.com/castagna/tdbloader3/blob/master/src/test/java/dev/TDBVerifier.java
>
> It's not an "official" command for TDB and it's just a quick hack, but you
> could help improve it and contribute it back to TDB as a command to check
> the integrity of an index and help people understand whether an index is
> corrupted and, if so, why.
>
> Hope this helps.
>
> Paolo


Re: verify load successful

Posted by Paolo Castagna <ca...@googlemail.com>.
Peter Jungen wrote:
> Hello dear jena team,
> I have loaded a dump (nquads format), numerous files, into my TDB using
> tdbloader2.
> How do I verify the loading of the dump was indeed completely successful?
> regards
> Pete

Hi Pete,
I usually use tdbdump. See tdbdump --help:

tdbdump : Write N-Quads to stdout
   Location
       --loc=DIR              Location (a directory)
       --tdb=                 Assembler description file
   Symbol definition
       --set                  Set a configuration symbol to a value
       --strict               Operate in strict SPARQL mode (no extensions of any kind)
       --desc=                Assembler description file
   General
       -v   --verbose         Verbose
       -q   --quiet           Run with minimal output
       --debug                Output information for debugging
       --help
       --version              Version information

You can also load the data with tdbloader, dump that out, sort it and diff it
against what you get from tdbloader2 (there should be no differences apart
from blank nodes).
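
The sort-and-diff comparison could be sketched roughly as below, working on two dumps already written out by tdbdump. This is an illustration only: the function names are hypothetical, and collapsing every blank node label to a single placeholder is a crude way to ignore label differences, so it can also hide real differences between blank nodes.

```python
import difflib
import re

# Blank node labels (for example _:b0) are not stable across loads.
# The pattern is a heuristic and could also match text inside a literal.
BNODE = re.compile(r"(?<!\S)_:\S+")


def normalise(lines):
    """Return sorted statements with blank node labels replaced by '_:b'."""
    out = []
    for line in lines:
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            out.append(BNODE.sub("_:b", stripped))
    return sorted(out)


def diff_dumps(a_lines, b_lines):
    """Return unified-diff lines between two dumps; empty if they match."""
    return list(
        difflib.unified_diff(
            normalise(a_lines),
            normalise(b_lines),
            fromfile="tdbloader",
            tofile="tdbloader2",
            lineterm="",
        )
    )
```

An empty result means the two dumps agree once ordering and blank node labels are ignored; any output lines point at statements present in only one of them.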

A small TDB "verifier" is here:
https://github.com/castagna/tdbloader3/blob/master/src/test/java/dev/TDBVerifier.java
It's not an "official" command for TDB and it's just a quick hack, but you could
help improve it and contribute it back to TDB as a command to check the integrity
of an index and help people understand whether an index is corrupted and, if so,
why.

Hope this helps.

Paolo