
Posted to users@jena.apache.org by Brad Moran <bm...@pinnacle21.net> on 2013/08/21 22:45:24 UTC

Creating Dataset from loaded TDB and text index

At this point I have a loaded TDB and text index, and now I am trying to
query them using Java. First I tried creating a dataset directly from the TDB:

Dataset ds = TDBFactory.createDataset(DBDirectory);

but the query logs these warnings and returns an empty ResultSet:

WARN  o.apache.jena.query.text.TextQueryPF - Failed to find the text index : tried context and as a text-enabled dataset
WARN  o.apache.jena.query.text.TextQueryPF - No text index - no text search performed

I am trying to run this query:

QueryExecution qExec = QueryExecutionFactory.create(
        "PREFIX text: <http://jena.apache.org/text#> "
        + "PREFIX mms: <http://rdf.cdisc.org/mms#> "
        + "SELECT * WHERE { ?s text:query (mms:dataElementName 'AEACN') }", ds);

ResultSet rs = qExec.execSelect();

So I figured the problem could be that I somehow need to combine the TDB
and text index into the same dataset. I tried:

String DBDirectory = "tdb";
String indexDir = "luceneIndexes";
File file = new File(indexDir);
Directory dir = FSDirectory.open(file);
TextIndex index = new TextIndexLucene(dir, null); // need to add the EntityDefinition?
Dataset ds = TDBFactory.createDataset(DBDirectory);
Dataset dataset = TextDatasetFactory.create(ds, index);

This query does not run because of a NullPointerException. I am not sure if
this is the right way to go about it. If this is the right way to combine
a TDB and a text index, is there an easy way to get the EntityDefinition from
the text index?

Re: Creating Dataset from loaded TDB and text index

Posted by Brad Moran <bm...@pinnacle21.net>.

I am not sure what I should be using as the second parameter for
DatasetFactory.assemble. Should I use the URI provided in the
documentation, or do I use the location of my own text dataset (my
luceneIndex?). Either way I receive an error.

If I use the URI provided, I get:
com.hp.hpl.jena.assembler.exceptions.AssemblerException: caught: Failed to
open:
/Users/brad/NetBeansProjects/mdr-older/trunk/NetBeansProjects/mdr-older/trunk/tdb/node2id.idn
(mode=rw)
  doing:
    root:
file:///Users/brad/NetBeansProjects/mdr-older/trunk/data.ttl#dataset with
type: http://jena.hpl.hp.com/2008/tdb#DatasetTDB assembler class: class
com.hp.hpl.jena.tdb.assembler.DatasetAssemblerTDB
    root: http://localhost/jena_example/#text_dataset with type:
http://jena.apache.org/text#TextDataset assembler class: class
org.apache.jena.query.text.assembler.TextDatasetAssembler


On Thu, Aug 22, 2013 at 2:14 AM, Andy Seaborne <an...@apache.org> wrote:

> On 21/08/13 21:45, Brad Moran wrote:
>
>> At this point I have a loaded TDB and text index, now I am trying to query
>> using java. First I tried creating a dataset directly from the TDB but
>> received
>>
>> Dataset ds = TDBFactory.createDataset(DBDirectory); <--How I created
>> Dataset
>>
>> WARN  o.apache.jena.query.text.TextQueryPF - Failed to find the text
>> index
>> : tried context and as a text-enabled dataset
>> WARN  o.apache.jena.query.text.TextQueryPF - No text index - no text
>> search
>> performed
>> and returns an empty resultSet
>>
>> I am trying to run this query:
>>
>> QueryExecution qExec = QueryExecutionFactory.create(
>>                          "PREFIX text: <http://jena.apache.org/text#>
>> PREFIX
>> mms: <http://rdf.cdisc.org/mms#> "
>>                          + "SELECT * WHERE{?s text:query
>> (mms:dataElementName 'AEACN')}", ds);
>>
>> ResultSet rs = qExec.execSelect();
>>
>> So I figured the problem could be that I some how need to combine the TDB
>> and text index into the same dataset, I tried:
>>
>> String DBDirectory = "tdb";
>> String indexDir = "luceneIndexes";
>> File file = new File(indexDir);
>> Directory dir = FSDirectory.open(file);
>> TextIndex index = new TextIndexLucene(dir, null);// need to add the
>> EntityDefinition?
>>
>
> Yes.
>
>
>> Dataset ds = TDBFactory.createDataset(DBDirectory);
>> Dataset dataset = TextDatasetFactory.create(ds, index);
>>
>
> then you will need to use 'dataset' not 'ds' to query.
>
>
>> This query does not run because of a nullPointerException.
>>
>
> stacktrace?
>
>
>> I am not sure if
>> this is the right way to go about this. If this is the right way to
>> combine
>> a TDB and text index, is there an easy way to get the EntityDefinition
>> from
>> the text index?
>>
>
> You can use the assembler file to get the dataset.
> (see the text search documentation)
>
> Else
> EntityDefinition entDef = new EntityDefinition("uri", "text", RDFS.label) ;
> and add the further declarations as in the assembler file.
>
> The assembler means you can keep the configuration in one place
>
>         Andy
>
>

Re: Creating Dataset from loaded TDB and text index

Posted by Brad Moran <bm...@pinnacle21.net>.

OK, this issue seems to be resolved. I had to change the location of the TDB
in the assembler file. NetBeans only needed the TDB location relative to the
project folder, whereas from bash I needed the full file system path.
Thanks again for all your help!
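
For anyone hitting the same error: one plausible reading of the doubled path in the exception (.../trunk/NetBeansProjects/mdr-older/trunk/tdb) is that a relative location was resolved against a base directory that already contained the same prefix. The sketch below only uses java.nio to show that effect; it does not call Jena, and the class and method names (TdbLocationDemo, resolveAgainst) are made up for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class TdbLocationDemo {
    // Resolve a location against a base directory, the way a relative
    // path is interpreted relative to wherever the program is started.
    // An absolute location is returned unchanged.
    static Path resolveAgainst(String baseDir, String location) {
        return Paths.get(baseDir).resolve(location).normalize();
    }

    public static void main(String[] args) {
        // Base directory taken from the paths in the error message.
        String projectDir = "/Users/brad/NetBeansProjects/mdr-older/trunk";

        // Relative to the project folder: lands on the real database directory.
        System.out.println(resolveAgainst(projectDir, "tdb"));

        // A longer relative path repeats the prefix, as in the exception.
        System.out.println(resolveAgainst(projectDir, "NetBeansProjects/mdr-older/trunk/tdb"));

        // A full file system path does not depend on the base directory.
        System.out.println(resolveAgainst("/tmp", "/Users/brad/NetBeansProjects/mdr-older/trunk/tdb"));
    }
}
```

Using the full path in the assembler file avoids any dependence on the directory the JVM is launched from, which would explain the difference between NetBeans and bash.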


On Thu, Aug 22, 2013 at 12:23 PM, Andy Seaborne <an...@apache.org> wrote:

> On 22/08/13 17:04, Brad Moran wrote:
>
>> I am not sure what I should be using as the second parameter for
>> DatasetFactory.assemble. Should I use the URI provided in the
>> documentation, or do I use the location of my own text dataset (my
>> luceneIndex?). Either way I receive an error.
>>
>> If I use the URI provided, I get:
>>
>> com.hp.hpl.jena.assembler.exceptions.AssemblerException: caught:
>> Failed to
>> open:
>> /Users/brad/NetBeansProjects/mdr-older/trunk/NetBeansProjects/mdr-older/trunk/tdb/node2id.idn
>> (mode=rw)
>>
>
> Why are you opening a specific file in the database directory?
>
> I can not understand where that URI came from except it looks like an
> incorrectly resolved relative file name.
>
> At a guess, something says:
>
> file:NetBeansProjects/mdr-older/trunk/tdb/node2id.idn
>
> which is a relative file name, you need to start with a file:///
>
> I have no idea how the node2id.idn got there.
>
> If you open this file directly, you may corrupt the database.
>
>
>>   doing:
>>     root:
>> file:///Users/brad/NetBeansProjects/mdr-older/trunk/data.ttl#dataset with
>>     type: http://jena.hpl.hp.com/2008/tdb#DatasetTDB assembler class: class
>> com.hp.hpl.jena.tdb.assembler.DatasetAssemblerTDB
>>     root: http://localhost/jena_example/#text_dataset with type:
>> http://jena.apache.org/text#TextDataset assembler class: class
>> org.apache.jena.query.text.assembler.TextDatasetAssembler
>>
>>
>> Otherwise if I use the location of lucene index on my file system I get:
>>
>> com.hp.hpl.jena.assembler.exceptions.NoSpecificTypeException: the root
>> luceneIndexes has no most specific type that is a subclass of ja:Object
>>
>>
> http://jena.apache.org/documentation/query/text-query.html#text-dataset-assembler
>
> Dataset ds = DatasetFactory.assemble(
>     "text-config.ttl",
>     "http://localhost/jena_example/#text_dataset") ;
>
> and http://localhost/jena_example/#text_dataset is the URI used in the
> assembler file.  In the example it's
>
> @prefix :        <http://localhost/jena_example/#> .
> ....
> :text_dataset rdf:type     text:TextDataset ;
>     text:dataset   <#dataset> ;
>     text:index     <#indexLucene> ;
>     .
>
> so "http://localhost/jena_example/#text_dataset".  This happens to be
> the same as in your assembler file of 20/Aug.
>
> Please provide a complete, minimal example of what you are doing.
>
>         Andy
>
>
>

Re: Creating Dataset from loaded TDB and text index

Posted by Andy Seaborne <an...@apache.org>.

On 22/08/13 17:04, Brad Moran wrote:
> I am not sure what I should be using as the second parameter for
> DatasetFactory.assemble. Should I use the URI provided in the
> documentation, or do I use the location of my own text dataset (my
> luceneIndex?). Either way I receive an error.
>
> If I use the URI provided, I get:
>
> com.hp.hpl.jena.assembler.exceptions.AssemblerException: caught: Failed to
> open:
> /Users/brad/NetBeansProjects/mdr-older/trunk/NetBeansProjects/mdr-older/trunk/tdb/node2id.idn
> (mode=rw)

Why are you opening a specific file in the database directory?

I can not understand where that URI came from except it looks like an
incorrectly resolved relative file name.

At a guess, something says:

file:NetBeansProjects/mdr-older/trunk/tdb/node2id.idn

which is a relative file name, you need to start with a file:///

I have no idea how the node2id.idn got there.

If you open this file directly, you may corrupt the database.
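
To make the file: versus file:/// distinction concrete, here is a small sketch using only java.net.URI. It shows that the form without slashes has no absolute path at all, while the file:/// form does. This is plain JDK behaviour and not Jena's own file-name handling, which may differ in detail; the helper name hasAbsolutePath is invented for this example, and the database directory (never a file inside it) is used as the sample location.

```java
import java.net.URI;

final class FileUriDemo {
    // True when the URI is hierarchical and carries an absolute path.
    // "file:some/dir" parses as an opaque URI, so it has no path component.
    static boolean hasAbsolutePath(String uri) {
        URI u = URI.create(uri);
        return !u.isOpaque() && u.getPath() != null && u.getPath().startsWith("/");
    }

    public static void main(String[] args) {
        // Relative form: depends on where the program is started.
        System.out.println(hasAbsolutePath("file:NetBeansProjects/mdr-older/trunk/tdb"));

        // Absolute form: names the same directory from any working directory.
        System.out.println(hasAbsolutePath("file:///Users/brad/NetBeansProjects/mdr-older/trunk/tdb"));
    }
}
```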

>    doing:
>      root:
> file:///Users/brad/NetBeansProjects/mdr-older/trunk/data.ttl#dataset with
>      type: http://jena.hpl.hp.com/2008/tdb#DatasetTDB assembler class: class
> com.hp.hpl.jena.tdb.assembler.DatasetAssemblerTDB
>      root: http://localhost/jena_example/#text_dataset with type:
> http://jena.apache.org/text#TextDataset assembler class: class
> org.apache.jena.query.text.assembler.TextDatasetAssembler
>
>
> Otherwise if I use the location of lucene index on my file system I get:
>
> com.hp.hpl.jena.assembler.exceptions.NoSpecificTypeException: the root
> luceneIndexes has no most specific type that is a subclass of ja:Object
>

http://jena.apache.org/documentation/query/text-query.html#text-dataset-assembler

Dataset ds = DatasetFactory.assemble(
     "text-config.ttl",
     "http://localhost/jena_example/#text_dataset") ;

and http://localhost/jena_example/#text_dataset is the URI used in the
assembler file.  In the example it's

@prefix :        <http://localhost/jena_example/#> .
....
:text_dataset rdf:type     text:TextDataset ;
     text:dataset   <#dataset> ;
     text:index     <#indexLucene> ;
     .

so "http://localhost/jena_example/#text_dataset".  This happens to be 
the same as in your assembler file of 20/Aug.
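
As a string-level illustration of the point above: the second argument is the prefix namespace plus the local name from the assembler file, not a directory on disk. The sketch does not call Jena (the Turtle parser does this expansion itself); AssemblerRootDemo, expand and isAbsoluteUri are hypothetical names used only here.

```java
import java.net.URI;

final class AssemblerRootDemo {
    // Expand a Turtle prefixed name such as ":text_dataset" by
    // concatenating the prefix namespace and the local name.
    static String expand(String namespace, String localName) {
        return namespace + localName;
    }

    // True when the string parses as a URI that has a scheme.
    static boolean isAbsoluteUri(String s) {
        return URI.create(s).isAbsolute();
    }

    public static void main(String[] args) {
        // Namespace and local name as they appear in the example assembler file.
        String root = expand("http://localhost/jena_example/#", "text_dataset");
        System.out.println(root + " absolute? " + isAbsoluteUri(root));

        // A bare directory name is not the name of any resource in that file.
        System.out.println("luceneIndexes absolute? " + isAbsoluteUri("luceneIndexes"));
    }
}
```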

Please provide a complete, minimal example of what you are doing.

	Andy



Re: Creating Dataset from loaded TDB and text index

Posted by Brad Moran <bm...@pinnacle21.net>.

I am not sure what I should be using as the second parameter for
DatasetFactory.assemble. Should I use the URI provided in the
documentation, or do I use the location of my own text dataset (my
luceneIndex?). Either way I receive an error.

If I use the URI provided, I get:

com.hp.hpl.jena.assembler.exceptions.AssemblerException: caught: Failed to
open:
/Users/brad/NetBeansProjects/mdr-older/trunk/NetBeansProjects/mdr-older/trunk/tdb/node2id.idn
(mode=rw)
  doing:
    root:
file:///Users/brad/NetBeansProjects/mdr-older/trunk/data.ttl#dataset with
    type: http://jena.hpl.hp.com/2008/tdb#DatasetTDB assembler class: class
com.hp.hpl.jena.tdb.assembler.DatasetAssemblerTDB
    root: http://localhost/jena_example/#text_dataset with type:
http://jena.apache.org/text#TextDataset assembler class: class
org.apache.jena.query.text.assembler.TextDatasetAssembler


Otherwise if I use the location of lucene index on my file system I get:

com.hp.hpl.jena.assembler.exceptions.NoSpecificTypeException: the root
luceneIndexes has no most specific type that is a subclass of ja:Object



Re: Creating Dataset from loaded TDB and text index

Posted by Andy Seaborne <an...@apache.org>.

On 21/08/13 21:45, Brad Moran wrote:
> At this point I have a loaded TDB and text index, now I am trying to query
> using java. First I tried creating a dataset directly from the TDB but
> received
>
> Dataset ds = TDBFactory.createDataset(DBDirectory); <--How I created Dataset
>
> WARN  o.apache.jena.query.text.TextQueryPF - Failed to find the text index
> : tried context and as a text-enabled dataset
> WARN  o.apache.jena.query.text.TextQueryPF - No text index - no text search
> performed
> and returns an empty resultSet
>
> I am trying to run this query:
>
> QueryExecution qExec = QueryExecutionFactory.create(
>                          "PREFIX text: <http://jena.apache.org/text#> PREFIX
> mms: <http://rdf.cdisc.org/mms#> "
>                          + "SELECT * WHERE{?s text:query
> (mms:dataElementName 'AEACN')}", ds);
>
> ResultSet rs = qExec.execSelect();
>
> So I figured the problem could be that I some how need to combine the TDB
> and text index into the same dataset, I tried:
>
> String DBDirectory = "tdb";
> String indexDir = "luceneIndexes";
> File file = new File(indexDir);
> Directory dir = FSDirectory.open(file);
> TextIndex index = new TextIndexLucene(dir, null);// need to add the
> EntityDefinition?

Yes.

> Dataset ds = TDBFactory.createDataset(DBDirectory);
> Dataset dataset = TextDatasetFactory.create(ds, index);

then you will need to use 'dataset' not 'ds' to query.

> This query does not run because of a nullPointerException.

stacktrace?

> I am not sure if
> this is the right way to go about this. If this is the right way to combine
> a TDB and text index, is there an easy way to get the EntityDefinition from
> the text index?

You can use the assembler file to get the dataset.
(see the text search documentation)

Otherwise, build the EntityDefinition in code:
EntityDefinition entDef = new EntityDefinition("uri", "text", RDFS.label) ;
and add the further declarations as in the assembler file.

The assembler means you can keep the configuration in one place.

	Andy