Posted to dev@jena.apache.org by Phillip Rhodes <mo...@gmail.com> on 2014/10/07 02:51:42 UTC

Jena Text returning bogus results?

Jena gang:

I'm trying to use the Jena-Text stuff to do text searches in SPARQL
and am running into a problem.  I am indexing two triples, with an
rdfs:label property, then doing a search using text:query that should
match 0 triples.  But I still get back two results when I run my query
(the two triples I previously indexed, even though they don't contain
the query string).

I'm sure I'm probably just doing something wrong, but so far I'm
having no luck figuring out what it is.  If somebody could look at
this code and give me a pointer, it would be much appreciated.


class JenaTextMain1
{

    static main(args)
    {

        // Base dataset
        Dataset dataset = TDBFactory.createDataset("jenastore");

        EntityDefinition entDef = new EntityDefinition("uri", "text", RDFS.label) ;

        // Lucene, in memory.
        Directory dir =  new RAMDirectory();

        // Join together into a dataset
        Dataset ds = TextDatasetFactory.createLucene(dataset, dir, entDef);

        ds.begin(ReadWrite.WRITE);

        Model m = ds.defaultModel;

        Resource rSubject = m.createResource( "http://ontology.fogbeam.com/example/TestResource1" );
        Resource rSubject2 = m.createResource( "http://ontology.fogbeam.com/example/TestResource2" );

        try
        {

            Statement s = m.createStatement(rSubject, RDFS.label, "This is a TEST Resource" );

            m.add( s );

            Statement s2 = m.createStatement(rSubject2, RDFS.label, "Bratwurst" );

            m.add( s2 );

            ds.commit();


            String baseQueryString =
            "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> " +
            "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> " +
            "PREFIX dc: <http://purl.org/dc/elements/1.1/> " +
            "PREFIX dcterm: <http://purl.org/dc/terms/> " +
            "PREFIX owl: <http://www.w3.org/2002/07/owl#> " +
            "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> " +
            "PREFIX text: <http://jena.apache.org/text#>";

            /* Do a SPARQL query using Jena-Text here... */
            String queryString = baseQueryString + "SELECT ?s { ?s text:query ('Flibble') ; rdfs:label ?label ; }";

            ds.begin(ReadWrite.READ);

            Query query = QueryFactory.create(queryString) ;

            Reasoner reasoner = ReasonerRegistry.getOWLMiniReasoner();
            InfModel inf = ModelFactory.createInfModel(reasoner, m);

            QueryExecution qexec = QueryExecutionFactory.create(query, inf);


            try
            {
                ResultSet solutions = qexec.execSelect();
                for ( ; solutions.hasNext() ; )
                {
                    QuerySolution soln = solutions.nextSolution();
                    println "solution: ${soln}";
                    Iterator iter = soln.varNames();
                }

                ds.commit();

            }
            finally
            {
                qexec.close();
            }
        }
        finally
        {
            if( ds != null )
            {
                ds.end();
            }
        }

        println "done";
    }

}


Running this results in this output:

solution: ( ?s = <http://ontology.fogbeam.com/example/TestResource1> )
solution: ( ?s = <http://ontology.fogbeam.com/example/TestResource2> )
done


Any and all help is greatly appreciated.

Also, on a related note... the page here:

https://jena.apache.org/documentation/query/text-query.html

has the following link listed as "example code here" but the link is
no longer valid.

https://svn.apache.org/repos/asf/jena/trunk/jena-text/src/main/java/examples/


Thanks,


Phil

This message optimized for indexing by NSA PRISM

Re: Jena Text returning bogus results?

Posted by Phillip Rhodes <mo...@gmail.com>.
Thanks Bruno! I did find the example Java code in Git and literally
like 2 minutes ago managed to get this working.  I still don't
*entirely* understand every aspect of how this works, but it's slowly
starting to make sense.

I'm going to play with it some more, and maybe I can write up some
docs on this and contribute to the Jena documentation.


Cheers,


Phil



Re: Jena Text returning bogus results?

Posted by "Bruno P. Kinoshita" <br...@yahoo.com.br>.
Hello Phillip,

I haven't used Jena-Text before, but at a customer site we are deploying Jena alongside a Hadoop cluster that comes with a Solr server, and I have been wanting to learn jena-text to see whether it will be useful in our project.

It took me some time to find what was different between your example and the one provided in jena-text. This gist [1] has my Java code, ported from your example.

Take a look at line 77: instead of querying the dataset with Lucene support, you're querying a model with a reasoner. My guess (which might be a long shot) is that this model resolves your query to true for every entry because it does not recognize the text:query () part.

Regarding the link: it is broken because of the migration from SVN to git, which happened days ago. It has already been reported in JENA-786 [2]. I have already run the W3C link checker tool [3] with recursion=20 to collect some of the 404 errors, and will see if I can write a simple patch that fixes some of them.

Oh, and take a look at QueryExecUtils#executeQuery(...); it's handy for quickly testing queries and printing the output.

Hope that helps,
Bruno

[1] https://gist.github.com/kinow/10875c79a94f4fd931c9
[2] https://issues.apache.org/jira/browse/JENA-786?jql=project%20%3D%20JENA%20AND%20updated%3E%3D-1w%20ORDER%20BY%20updated%20DESC

[3] http://validator.w3.org/checklink
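
For readers following the thread, here is a minimal Groovy sketch of the change described above: the query is executed against the text-enabled dataset rather than an InfModel built from its default model. It is an illustration, not the code from the gist. It assumes the Jena 2.x-era package names (com.hp.hpl.jena.*) and the three-argument TextDatasetFactory.createLucene used in the original post; later Jena releases moved these classes to org.apache.jena.* and changed the Lucene setup. It also swaps the on-disk "jenastore" for an in-memory TDB dataset so each run starts empty.

```groovy
import org.apache.jena.query.text.EntityDefinition
import org.apache.jena.query.text.TextDatasetFactory
import org.apache.lucene.store.Directory
import org.apache.lucene.store.RAMDirectory

import com.hp.hpl.jena.query.Dataset
import com.hp.hpl.jena.query.Query
import com.hp.hpl.jena.query.QueryExecution
import com.hp.hpl.jena.query.QueryExecutionFactory
import com.hp.hpl.jena.query.QueryFactory
import com.hp.hpl.jena.query.ReadWrite
import com.hp.hpl.jena.rdf.model.Model
import com.hp.hpl.jena.sparql.util.QueryExecUtils
import com.hp.hpl.jena.tdb.TDBFactory
import com.hp.hpl.jena.vocabulary.RDFS

// Assumption: in-memory TDB store instead of the "jenastore" directory from the post.
Dataset base = TDBFactory.createDataset()

// Index rdfs:label values in the Lucene field "text", keyed by subject URI.
EntityDefinition entDef = new EntityDefinition("uri", "text", RDFS.label)
Directory dir = new RAMDirectory()

// Text-enabled dataset: additions made through it are also sent to the Lucene index.
Dataset ds = TextDatasetFactory.createLucene(base, dir, entDef)

ds.begin(ReadWrite.WRITE)
try {
    Model m = ds.getDefaultModel()
    m.add(m.createResource("http://ontology.fogbeam.com/example/TestResource1"),
          RDFS.label, "This is a TEST Resource")
    m.add(m.createResource("http://ontology.fogbeam.com/example/TestResource2"),
          RDFS.label, "Bratwurst")
    ds.commit()
} finally {
    ds.end()
}

String queryString = '''
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX text: <http://jena.apache.org/text#>
SELECT ?s { ?s text:query ('Flibble') ; rdfs:label ?label }
'''

ds.begin(ReadWrite.READ)
try {
    Query query = QueryFactory.create(queryString)
    // The difference from the original post: pass the text dataset (ds),
    // not an InfModel wrapped around its default model.
    QueryExecution qexec = QueryExecutionFactory.create(query, ds)
    try {
        // Prints the result table to standard output.
        QueryExecUtils.executeQuery(query, qexec)
    } finally {
        qexec.close()
    }
} finally {
    ds.end()
}
```

If the text index is being consulted, a search for 'Flibble' should match neither label; replacing it with 'Bratwurst' is a quick way to check that indexed labels are found.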
 