
Posted to users@jena.apache.org by Alex To <to...@gmail.com> on 2019/09/12 01:53:45 UTC

Can Jena Full Text search work with other Jena based API like Virtuoso Jena or MarkLogic Jena ?

Hi

I have so far been happy with Jena + Lucene / Elastic. I am just looking for a
quick answer on whether it can work with other Jena-based APIs such as
Virtuoso or MarkLogic.

If I wrap a MarkLogic Dataset in a Jena TextDataset, will it work as
expected?

Given that a MarkLogic / Virtuoso Dataset implements the Jena Dataset
interface, it may work, but I am not sure because "text:query" seems to
be more Jena-specific.

I will try it out myself in the next couple of days to see if it works, but
if there is a quick answer it may save me a couple of hours :)

Thanks a lot

Regards
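
A rough way to picture the wrapping asked about above: the text dataset sits
in front of the base dataset, passes every add through to it, and records the
text of selected predicates in a separate index. The sketch below models only
that idea in plain Java. The names TripleStore, MemoryStore and IndexingStore
are hypothetical and are not Jena, MarkLogic or Virtuoso classes, and the word
index is a toy stand-in for Lucene or Elasticsearch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical minimal storage interface; it stands in for the dataset API
// that a wrapper would be written against.
interface TripleStore {
    void add(String subject, String predicate, String object);
    int size();
}

// Hypothetical backend: an in-memory list. A remote store would implement
// the same interface.
class MemoryStore implements TripleStore {
    private final List<String[]> triples = new ArrayList<>();

    @Override
    public void add(String subject, String predicate, String object) {
        triples.add(new String[] {subject, predicate, object});
    }

    @Override
    public int size() {
        return triples.size();
    }
}

// Wrapper: forwards every add to the base store and, for one chosen
// predicate, also records which subjects mention each word.
class IndexingStore implements TripleStore {
    private final TripleStore base;
    private final String indexedPredicate;
    private final Map<String, List<String>> subjectsByWord = new HashMap<>();

    IndexingStore(TripleStore base, String indexedPredicate) {
        this.base = base;
        this.indexedPredicate = indexedPredicate;
    }

    @Override
    public void add(String subject, String predicate, String object) {
        base.add(subject, predicate, object);   // the data always reaches the backend
        if (!predicate.equals(indexedPredicate)) {
            return;                              // other predicates are not indexed
        }
        for (String word : object.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            List<String> subjects =
                subjectsByWord.computeIfAbsent(word, k -> new ArrayList<>());
            if (!subjects.contains(subject)) {
                subjects.add(subject);
            }
        }
    }

    @Override
    public int size() {
        return base.size();
    }

    // Returns the subjects whose indexed text contains the given word.
    List<String> search(String word) {
        List<String> hits = subjectsByWord.get(word.toLowerCase(Locale.ROOT));
        if (hits == null) {
            return new ArrayList<>();
        }
        return new ArrayList<>(hits);
    }
}

class WrapDemo {
    public static void main(String[] args) {
        IndexingStore store = new IndexingStore(new MemoryStore(), "rdfs:label");
        store.add("ex:jena", "rdfs:label", "Apache Jena");
        store.add("ex:jena", "ex:version", "3");
        System.out.println("triples stored: " + store.size());
        System.out.println("hits for 'jena': " + store.search("jena"));
    }
}
```

Because the wrapper relies only on the interface, any backend that implements
it can be plugged in. What the sketch cannot show is how expensive each
forwarded add is on a remote store, or whether a query feature such as
"text:query" is evaluated by the wrapper or by the backend.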

Re: Can Jena Full Text search work with other Jena based API like Virtuoso Jena or MarkLogic Jena ?

Posted by Dan Davis <da...@gmail.com>.

It would be of tremendous value to my project if this works; I wish I had
time to try it also.

> > > > On Sun, Sep 15, 2019 at 6:53 PM Andy Seaborne <an...@apache.org> wrote:
> > > >
> > > > >> Alex,
> > > > >>
> > > > >> I can't try it out - I don't have a MarkLogic system.
> > > > >>
> > > > >> Can you see in the server logs what is happening?
> > > > >>
> > > > >>  > Pure speculation but parts 1 & 2 sound like the data load is not going
> > > > >>  > to MarkLogic as a single transaction but as "autocommit" - one
> > > > >>  > transaction for each triple added.
> > > > >>
> > > > >>      Andy
> > > > >>
> > > > >> On 13/09/2019 23:04, Andy Seaborne wrote:
> > > > >> > The maven central artifact com.marklogic:marklogic-jena is 3.0.6 but our
> > > > >> > code depends on 3.1.0 - what code is it using?
> > > > >> >
> > > > >> > On 13/09/2019 01:18, Alex To wrote:
> > > > >> >> I created a small program to try out Lucene with MarkLogic Jena here
> > > > >> >>
> > > > >> >> https://github.com/AlexTo/jena-lab/blob/master/src/main/java/com/company/MainMarkLogic.java
> > > > >> >>
> > > > >> >> My observations are as follows (see my comments at line 54 & 56)
> > > > >> >>
> > > > >> >> 1. If the model reads a small file with 2 triples, the loading finishes
> > > > >> >> quickly
> > > > >> >> 2. If the model reads a slightly larger file (1.5MB), the loading takes
> > > > >> >> forever so I have to terminate it
> > > > >> >
> > > > >> > Pure speculation but parts 1 & 2 sound like the data load is not going
> > > > >> > to MarkLogic as a single transaction but as "autocommit" - one
> > > > >> > transaction for each triple added.
> > > > >> >
> > > > >> >      Andy
> > > > >> >
> > > > >> >
> > > > >> >> 3. After loading the small file, searching the Lucene index directly
> > > > >> >> shows that the triples are indexed
> > > > >> >> 4. After loading the small file, running a SPARQL query with "text:query"
> > > > >> >> does not finish
> > > > >> >>
> > > > >> >> For now I created 2 separate implementations in my program to support
> > > > >> >> full text search with Jena or MarkLogic, but I would like to know
> > > > >> >> whether it is still possible to use Jena Elastic indexing with TextDataset,
> > > > >> >> because then I can provide a single UI to users to configure their search
> > > > >> >> regardless of the back end. :)
> > > > >> >>
> > > > >> >>
> > > > >> >> On Fri, Sep 13, 2019 at 1:07 AM Dan Davis <da...@gmail.com> wrote:
> > > > >> >>
> > > > >> >>> I am incorrect, and apologize. Virtuoso's Jena 3 driver includes an
> > > > >> >>> implementation of Dataset, and so while the application is only using
> > > > >> >>> virtuoso.jena.driver.VirtGraph and
> > > > >> >>> virtuoso.jena.driver.VirtuosoQueryExecution (and factory), a more flexible
> > > > >> >>> integration is possible. I look forward to experimenting with it and
> > > > >> >>> seeing what I can do on the backend.
> > > > >> >>>
> > > > >> >>> On Thu, Sep 12, 2019 at 10:19 AM Dan Davis <da...@gmail.com> wrote:
> > > > >> >>>
> > > > >> >>>> Virtuoso's Jena driver implements the model interface, rather than the
> > > > >> >>>> DatasetGraphAPI. It translates the SPARQL query into its own JDBC
> > > > >> >>>> interface. You can see the architecture at
> > > > >> >>>> http://docs.openlinksw.com/virtuoso/rdfnativestorageprovidersjena/#rdfnativestorageprovidersjenawhatisv
> > > > >> >>>>
> > > > >> >>>> However, Virtuoso has its own full-text indexing, which can be effective.
> > > > >> >>>> Its rules for translating words into queries are not as flexible as
> > > > >> >>>> lucene/solr/elastic, but it does allow you to specify what should be
> > > > >> >>>> indexed - e.g. which objects from which data properties in which
> > > > >> >>>> graphs.
> > > > >> >>>>
> > > > >> >>>> I use Virtuoso behind virt_jena and virt_jdbc. You can see the code at
> > > > >> >>>> https://github.com/HHS/lodestar, which is run underneath
> > > > >> >>>> https://github.com/HHS/meshrdf. You will see that
> > > > >> >>>> https://github.com/HHS/lodestar is a fork from EBI, but the NLM copy has
> > > > >> >>>> been updated to Jena 3. The EBI version is ahead on UI features,
> > > > >> >>>> however.
> > > > >> >>>>
> > > > >> >>>> I cannot speak to MarkLogic, Stardog, etc.
> > > > >> >>>>
> > > > >> >>>> EBI's lodestar still uses Jena 2, but the fork at HHS has been
> > > > >> >>>> updated to Jena 3.
> > > > >> >>>>
> > > > >> >>>> Virtuoso has its own full-text indexing, which is not as flexible in how
> > > > >> >>>> it indexes as Elastic/Solr/Lucene. It still works.
> > > > >> >>>>
> > > > >> >>>> On Thu, Sep 12, 2019 at 7:03 AM Andy Seaborne <andy@apache.org> wrote:
> > > > >> >>>>
> > > > >> >>>>> Yes, probably - but.
> > > > >> >>>>>
> > > > >> >>>>> The Jena text index will work in conjunction with any (Jena)
> > > > >> >>>>> DatasetGraphAPI implementation. 3rd party systems are not tested in the
> > > > >> >>>>> build.
> > > > >> >>>>>
> > > > >> >>>>> The "but" is efficiency. Both those systems have their own built-in text
> > > > >> >>>>> indexing which execute as part of the native query engine. This may be a
> > > > >> >>>>> factor for you, it may not.
> > > > >> >>>>>
> > > > >> >>>>> Let us know how you get on trying it.
> > > > >> >>>>>
> > > > >> >>>>> ----
> > > > >> >>>>>
> > > > >> >>>>> There is a SPARQL 1.2 issue about standardizing text query.
> > > > >> >>>>>
> > > > >> >>>>> Issue 40 : SPARQL 1.2 Community Group:
> > > > >> >>>>> https://github.com/w3c/sparql-12/issues/40
> > > > >> >>>>>
> > > > >> >>>>>       Andy
> > > > >> >>>>>

Re: Can Jena Full Text search work with other Jena based API like Virtuoso Jena or MarkLogic Jena ?

Posted by Alex To <to...@gmail.com>.

Hi Dan
Thanks for your suggestion, but I am not trying to load a large dataset yet.

I am trying to see if I can use Jena full text search with other Jena-based
APIs such as MarkLogic or Virtuoso, but it seems it doesn't work as
expected. Not a Jena problem though. My setup is:

1. Input file: dbpedia.owl (2.5MB)
2. Import using MarkLogic Jena without TextDataset: 1 minute
3. Import using MarkLogic Jena with a TextDataset wrapped around it: 13
minutes

Regards



-- 

Alex To

PhD Candidate

School of Computer Science

Knowledge Discovery and Management Research Group

Faculty of Engineering & IT

THE UNIVERSITY OF SYDNEY | NSW | 2006

Desk 4e69 | Building J12| 1 Cleveland Street

M. +61423330656 <%2B61450061602>

Re: Can Jena Full Text search work with other Jena based API like Virtuoso Jena or MarkLogic Jena ?

Posted by Dan Davis <da...@gmail.com>.

dbpedia is not actually that large.  Make sure you test with RDF datasets
that really represent your data.


Re: Can Jena Full Text search work with other Jena based API like Virtuoso Jena or MarkLogic Jena ?

Posted by Alex To <to...@gmail.com>.

Update: I switched from Lucene to Elasticsearch 6.4.3 and Kibana. Both Jena
and MarkLogic Jena work with indexing; I haven't tried querying MarkLogic
with text:query though.

Using Kibana, I could see the number of documents increasing while
importing data with MarkLogic; however, it is very slow.

Importing dbpedia.owl (2.5MB) with MarkLogic Jena takes less than a minute
without indexing.

With a TextDataset wrapped around the MarkLogic dataset, it takes 13 minutes,
so I guess the MarkLogic dataset does not send triples in batches when used
with TextDataset.
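
The "not sent in batches" guess, like the "autocommit" speculation elsewhere
in this thread, comes down to how many commits the backend is asked to
perform. The sketch below only counts commits for the two loading styles.
CountingBackend and LoadDemo are hypothetical names, no MarkLogic or Jena API
is involved, and nothing here measures real timings.

```java
// Hypothetical backend that only counts how many commits it performs.
class CountingBackend {
    private boolean inTransaction = false;
    private int pending = 0;   // triples added but not yet committed
    int commits = 0;           // number of commits performed
    int stored = 0;            // triples made durable so far

    void begin() {
        inTransaction = true;
    }

    void add(String triple) {
        pending++;
        if (!inTransaction) {
            flush();           // autocommit: every add is its own transaction
        }
    }

    void commit() {
        flush();
        inTransaction = false;
    }

    private void flush() {
        if (pending == 0) {
            return;
        }
        stored += pending;
        pending = 0;
        commits++;
    }
}

class LoadDemo {
    // Adds n triples without an enclosing transaction.
    static CountingBackend loadAutocommit(int n) {
        CountingBackend backend = new CountingBackend();
        for (int i = 0; i < n; i++) {
            backend.add("triple-" + i);
        }
        return backend;
    }

    // Adds n triples inside a single begin/commit pair.
    static CountingBackend loadInOneTransaction(int n) {
        CountingBackend backend = new CountingBackend();
        backend.begin();
        for (int i = 0; i < n; i++) {
            backend.add("triple-" + i);
        }
        backend.commit();
        return backend;
    }

    public static void main(String[] args) {
        int n = 10000;
        System.out.println("autocommit commits: " + loadAutocommit(n).commits);
        System.out.println("single transaction commits: " + loadInOneTransaction(n).commits);
    }
}
```

If each commit were a network round trip, loading n triples would cost n
round trips in the first style and one in the second. That would be
consistent with the slowdown reported here, but the thread does not confirm
that this is the cause.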




Re: Can Jena Full Text search work with other Jena based API like Virtuoso Jena or MarkLogic Jena ?

Posted by Alex To <to...@gmail.com>.

Hi Andy

I ended up creating separate implementations for Jena and MarkLogic full
text search for now, due to the time constraints of the project. I will
investigate further at a later time.

Thank you

Best Regards

On Sun, Sep 15, 2019 at 6:53 PM Andy Seaborne <an...@apache.org> wrote:

> Alex,
>
> I can't try it out - I don't have a Marklogic system.
>
> Can you see in the server logs what is happening?
>
>  > Pure speculation but parts 1 & 2 sounds like the data load is not going
>  > to MarkLogic as a single transaction but as "autocommit" - one
>  > transaction for each triple added.
>
>      Andy

Re: Can Jena Full Text search work with other Jena based API like Virtuoso Jena or MarkLogic Jena ?

Posted by Andy Seaborne <an...@apache.org>.

Alex,

I can't try it out - I don't have a MarkLogic system.

Can you see in the server logs what is happening?

 > Pure speculation but parts 1 & 2 sounds like the data load is not going
 > to MarkLogic as a single transaction but as "autocommit" - one
 > transaction for each triple added.

     Andy

On 13/09/2019 23:04, Andy Seaborne wrote:
> The maven central artifact com.marklogic:marklogic-jena is 3.0.6 but our 
> code depends on 3.1.0 - what code is it using?
> 
> On 13/09/2019 01:18, Alex To wrote:
>> I created a small program to try out Lucene with MarkLogic Jena here
>>
>> https://github.com/AlexTo/jena-lab/blob/master/src/main/java/com/company/MainMarkLogic.java 
>>
>>
>>
>> My observation is as follows (see my comment at line 54 & 56)
>>
>> 1. If the model reads a small file with 2 triples, the loading can finish
>> quickly
>> 2. If the model reads a slightly larger file (1.5MB), the loading takes
>> forever so I have to terminate it
> 
> Pure speculation but parts 1 & 2 sounds like the data load is not going 
> to MarkLogic as a single transaction but as "autocommit" - one 
> transaction for each triple added.
> 
>      Andy
> 
> 
>> 3. After loading the small file, searching the Lucene index direct shows
>> that the triples are indexed
>> 4. After loading the small file, run SPARQL query with "text:query" won't
>> finish
>>
>> For now I created 2 separate implementation in my program to support Full
>> Text search with Jena or MarkLogic but I look forward to know more 
>> whether
>> it is still possible to use Jena Elastic indexing with TextDataset 
>> because
>> then I can provide a single UI to users to configure their search
>> regardless of the back end. :)

Re: Can Jena Full Text search work with other Jena based API like Virtuoso Jena or MarkLogic Jena ?

Posted by Alex To <to...@gmail.com>.

Hi Andy

I had to pull the develop branch from here
https://github.com/marklogic/marklogic-jena/tree/develop to get the version
that works with Jena 3.1x.0

Then I updated the file
https://github.com/marklogic/marklogic-jena/blob/develop/marklogic-jena/build.gradle

with the following changes:

1. Line 9: change version *3.0-SNAPSHOT* to *3.1.0*
2. Line 13: change *3.10.0* to *3.12.0*

Then I ran "gradlew install" to install it into my local Maven repository.

On Sat, Sep 14, 2019 at 8:05 AM Andy Seaborne <an...@apache.org> wrote:

> The maven central artifact com.marklogic:marklogic-jena is 3.0.6 but our
> code depends on 3.1.0 - what code is it using?
>
> On 13/09/2019 01:18, Alex To wrote:
> > I created a small program to try out Lucene with MarkLogic Jena here
> >
> >
> https://github.com/AlexTo/jena-lab/blob/master/src/main/java/com/company/MainMarkLogic.java
> >
> >
> > My observation is as follows (see my comment at line 54 & 56)
> >
> > 1. If the model reads a small file with 2 triples, the loading can finish
> > quickly
> > 2. If the model reads a slightly larger file (1.5MB), the loading takes
> > forever so I have to terminate it
>
> Pure speculation but parts 1 & 2 sounds like the data load is not going
> to MarkLogic as a single transaction but as "autocommit" - one
> transaction for each triple added.
>
>      Andy
>
>
> > 3. After loading the small file, searching the Lucene index direct shows
> > that the triples are indexed
> > 4. After loading the small file, run SPARQL query with "text:query" won't
> > finish
> >
> > For now I created 2 separate implementation in my program to support Full
> > Text search with Jena or MarkLogic but I look forward to know more
> whether
> > it is still possible to use Jena Elastic indexing with TextDataset
> because
> > then I can provide a single UI to users to configure their search
> > regardless of the back end. :)


-- 

Alex To

PhD Candidate

School of Computer Science

Knowledge Discovery and Management Research Group

Faculty of Engineering & IT

THE UNIVERSITY OF SYDNEY | NSW | 2006

Desk 4e69 | Building J12| 1 Cleveland Street

M. +61423330656 <%2B61450061602>

Re: Can Jena Full Text search work with other Jena based API like Virtuoso Jena or MarkLogic Jena ?

Posted by Andy Seaborne <an...@apache.org>.

The Maven Central artifact com.marklogic:marklogic-jena is 3.0.6 but your
code depends on 3.1.0 - what code is it using?

On 13/09/2019 01:18, Alex To wrote:
> I created a small program to try out Lucene with MarkLogic Jena here
> 
> https://github.com/AlexTo/jena-lab/blob/master/src/main/java/com/company/MainMarkLogic.java
> 
> 
> My observation is as follows (see my comment at line 54 & 56)
> 
> 1. If the model reads a small file with 2 triples, the loading can finish
> quickly
> 2. If the model reads a slightly larger file (1.5MB), the loading takes
> forever so I have to terminate it

Pure speculation, but parts 1 & 2 sound like the data load is not going
to MarkLogic as a single transaction but as "autocommit" - one
transaction for each triple added.
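
To make the speculation concrete, here is a minimal sketch of why per-triple
"autocommit" can be so much slower than one transaction. CountingStore is a
hypothetical toy class, not MarkLogic or Jena code; each commit stands in for
one round trip to the server.

```java
import java.util.ArrayList;
import java.util.List;

public class CommitCostSketch {

    /** Hypothetical store that counts commits; not a Jena or MarkLogic class. */
    static class CountingStore {
        private final List<String> committed = new ArrayList<>();
        private final List<String> pending = new ArrayList<>();
        private boolean inTransaction = false;
        private int commits = 0;

        void begin() {
            inTransaction = true;
        }

        void add(String triple) {
            pending.add(triple);
            if (!inTransaction) {
                flush(); // autocommit: every add becomes its own commit
            }
        }

        void commit() {
            flush();
            inTransaction = false;
        }

        private void flush() {
            committed.addAll(pending);
            pending.clear();
            commits++;
        }

        int commits() {
            return commits;
        }

        int size() {
            return committed.size();
        }
    }

    /** Adds n triples with no explicit transaction and returns the commit count. */
    static int loadWithAutocommit(int n) {
        CountingStore store = new CountingStore();
        for (int i = 0; i < n; i++) {
            store.add("triple-" + i);
        }
        return store.commits();
    }

    /** Adds n triples inside one explicit transaction and returns the commit count. */
    static int loadInOneTransaction(int n) {
        CountingStore store = new CountingStore();
        store.begin();
        for (int i = 0; i < n; i++) {
            store.add("triple-" + i);
        }
        store.commit();
        return store.commits();
    }
}
```

With Jena itself, the usual way to get the second behaviour is to wrap the
load in dataset.begin(ReadWrite.WRITE) ... dataset.commit() ...
dataset.end(). Whether the MarkLogic adapter then sends the data as one batch
is an assumption that would need checking in the server logs.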

     Andy


> 3. After loading the small file, searching the Lucene index direct shows
> that the triples are indexed
> 4. After loading the small file, run SPARQL query with "text:query" won't
> finish
> 
> For now I created 2 separate implementation in my program to support Full
> Text search with Jena or MarkLogic but I look forward to know more whether
> it is still possible to use Jena Elastic indexing with TextDataset because
> then I can provide a single UI to users to configure their search
> regardless of the back end. :)

Re: Can Jena Full Text search work with other Jena based API like Virtuoso Jena or MarkLogic Jena ?

Posted by Alex To <to...@gmail.com>.

I created a small program to try out Lucene with MarkLogic Jena here

https://github.com/AlexTo/jena-lab/blob/master/src/main/java/com/company/MainMarkLogic.java


My observations are as follows (see my comments at lines 54 & 56)

1. If the model reads a small file with 2 triples, the loading finishes
quickly
2. If the model reads a slightly larger file (1.5MB), the loading takes
forever, so I have to terminate it
3. After loading the small file, searching the Lucene index directly shows
that the triples are indexed
4. After loading the small file, a SPARQL query with "text:query" never
finishes
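
For reference, the kind of query meant in point 4 can be sketched as below.
The helper name labelSearch is hypothetical, and the sketch assumes rdfs:label
is the property configured in the text index; the text: prefix and the
(property 'query string' limit) argument form are the ones from the Jena text
query documentation.

```java
public class TextQuerySketch {

    /**
     * Builds a SPARQL query that uses the Jena text:query property function.
     * Assumes rdfs:label is an indexed property (an assumption for this sketch).
     */
    static String labelSearch(String searchTerm, int limit) {
        // Escape backslashes and single quotes so the term stays inside the
        // single-quoted SPARQL literal. Lucene query syntax is left untouched.
        String escaped = searchTerm.replace("\\", "\\\\").replace("'", "\\'");
        return String.join("\n",
                "PREFIX text: <http://jena.apache.org/text#>",
                "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
                "SELECT ?s ?label WHERE {",
                "  ?s text:query (rdfs:label '" + escaped + "' " + limit + ") ;",
                "     rdfs:label ?label .",
                "}");
    }
}
```

The first pattern is answered from the text index and the second one from the
underlying store, so a query like this exercises both halves of the setup.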

For now I created 2 separate implementations in my program to support full
text search with Jena or MarkLogic, but I would like to know whether
it is still possible to use Jena Elastic indexing with TextDataset, because
then I can provide a single UI for users to configure their search
regardless of the back end. :)


On Fri, Sep 13, 2019 at 1:07 AM Dan Davis <da...@gmail.com> wrote:

> I am incorrect, and apologize. Virtuoso's Jena 3 driver includes an
> implementation of Dataset, and so while application is only using the
> virtuoso.jena.driver.VirtGraph and
> virtuoso.jena.driver.VirtuosoQueryExecution (and factory), a more flexible
> integration is possible. I look forward to experimenting with it and seeing
> what I can do on the backend.


Re: Can Jena Full Text search work with other Jena based API like Virtuoso Jena or MarkLogic Jena ?

Posted by Dan Davis <da...@gmail.com>.

I am incorrect, and apologize. Virtuoso's Jena 3 driver includes an
implementation of Dataset, and so while the application is only using the
virtuoso.jena.driver.VirtGraph and
virtuoso.jena.driver.VirtuosoQueryExecution (and factory), a more flexible
integration is possible. I look forward to experimenting with it and seeing
what I can do on the backend.

On Thu, Sep 12, 2019 at 10:19 AM Dan Davis <da...@gmail.com> wrote:

> Virtuoso's Jena driver implements the model interface, rather than the
> DatasetGraphAPI.  is translating the SPARQL query into its own JDBC
> interface. You can see the architecture at
> http://docs.openlinksw.com/virtuoso/rdfnativestorageprovidersjena/#rdfnativestorageprovidersjenawhatisv. However,
> Virtuoso has its own full-text indexing, which can be effective. Its rules
> for translating words into queries is not as flexible as
> lucene/solr/elastic, but it does allow you to specify what should be
> indexed - e.g. which objects from which which data properties in which
> graphs.
>
> I use Virtuoso behind virt_jena and virt_jdbc.  You can see the code at
> https://github.com/HHS/lodestar, which is run underneath
> https://github.com/HHS/meshrdf.   You will see that
> https://github.com/HHS/lodestar is a fork from EBI, but the NLM copy has
> been updated to Jena 3. The EBI version is ahead on UI features however.
>
> I cannot speak to MarkLogic, Stardog, etc.
>
>
>
>
>
> EBI's lodestar still uses Jena 2, but the fork at HHS has been updated to
> Jena 3.
>
> Virtuoso has its own full-text indexing, which is not as flexible in how
> it indexes as Elastic/Solr/Lucene.   It still works.

Re: Can Jena Full Text search work with other Jena based API like Virtuoso Jena or MarkLogic Jena ?

Posted by Dan Davis <da...@gmail.com>.

Virtuoso's Jena driver implements the Model interface, rather than the
DatasetGraph API. It translates the SPARQL query into its own JDBC
interface. You can see the architecture at
http://docs.openlinksw.com/virtuoso/rdfnativestorageprovidersjena/#rdfnativestorageprovidersjenawhatisv.
However, Virtuoso has its own full-text indexing, which can be effective.
Its rules for translating words into queries are not as flexible as
lucene/solr/elastic, but it does allow you to specify what should be
indexed - e.g. which objects from which data properties in which
graphs.

I use Virtuoso behind virt_jena and virt_jdbc.  You can see the code at
https://github.com/HHS/lodestar, which is run underneath
https://github.com/HHS/meshrdf.   You will see that
https://github.com/HHS/lodestar is a fork from EBI, but the NLM copy has
been updated to Jena 3. The EBI version is ahead on UI features however.

I cannot speak to MarkLogic, Stardog, etc.

EBI's lodestar still uses Jena 2, but the fork at HHS has been updated to
Jena 3.

Virtuoso has its own full-text indexing, which is not as flexible in how it
indexes as Elastic/Solr/Lucene.   It still works.

On Thu, Sep 12, 2019 at 7:03 AM Andy Seaborne <an...@apache.org> wrote:

> Yes, probably - but.
>
> The Jena text index will work in conjunction with any (Jena)
> DatasetGraphAPI implementation. 3rd party systems are not tested in the
> build.
>
> The "but" is efficiency. Both those systems have their own built-in text
> indexing which execute as part of the native query engine. This may be a
> factor for you, it may not.
>
> Let us know how you get on trying it.
>
> ----
>
> There is a SPARQL 1.2 issue about standardizing text query.
>
> Issue 40 : SPARQL 1.2 Community Group:
> https://github.com/w3c/sparql-12/issues/40
>
>      Andy
>
> On 12/09/2019 02:53, Alex To wrote:
> > Hi
> >
> > I have so far been happy with Jena + Lucene / Elastic. Just trying to get a
> > quick answer whether it can work with other Jena based API like Virtuoso /
> > MarkLogic.
> >
> > If I wrap a MarkLogic Dataset in a Jena TextDataset, can it work as
> > expected ?
> >
> > Given that a MarkLogic / Virtuoso Dataset implements Jena Dataset
> > interface, it may work but I am not sure because the "text:query" seems to
> > be more Jena specific.
> >
> > I will try out myself in the next couple of days to see if it works but if
> > there is a quick answer it may save me a couple of hours :)
> >
> > Thank a lot
> >
> > Regards
> >
>

Re: Can Jena Full Text search work with other Jena based API like Virtuoso Jena or MarkLogic Jena ?

Posted by Andy Seaborne <an...@apache.org>.

Yes, probably - but.

The Jena text index will work in conjunction with any (Jena) 
DatasetGraphAPI implementation. 3rd party systems are not tested in the 
build.
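
For reference, a minimal sketch of the kind of query involved. "text:query"
is a property function in the jena-text namespace
(http://jena.apache.org/text#) and is evaluated by Jena's own query engine,
so a query like this has to be run through Jena against the text-wrapped
dataset rather than sent to the third-party store's native endpoint. The
class and method names below are hypothetical, and rdfs:label is assumed to
be the property configured in the text index.

```java
// Sketch only: builds the SPARQL text for a Jena text-index lookup.
// Names here are illustrative, not part of Jena.
final class TextQuerySketch {

    // Namespace of the jena-text property functions (text:query).
    static final String TEXT_NS = "http://jena.apache.org/text#";

    // Escape backslashes and single quotes so the term is a valid
    // single-quoted SPARQL string literal.
    static String escape(String term) {
        return term.replace("\\", "\\\\").replace("'", "\\'");
    }

    // property: prefixed name of an indexed property, e.g. "rdfs:label".
    // term:     search string passed to the text index.
    // limit:    maximum number of index hits to request.
    static String build(String property, String term, int limit) {
        return String.join("\n",
            "PREFIX text: <" + TEXT_NS + ">",
            "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
            "SELECT ?s WHERE {",
            "  ?s text:query (" + property + " '" + escape(term) + "' " + limit + ") .",
            "}");
    }

    public static void main(String[] args) {
        System.out.println(build("rdfs:label", "protein", 10));
    }
}
```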

The "but" is efficiency. Both those systems have their own built-in text
indexing, which executes as part of the native query engine. This may be a
factor for you, it may not.

Let us know how you get on trying it.

----

There is a SPARQL 1.2 issue about standardizing text query.

Issue 40 : SPARQL 1.2 Community Group:
https://github.com/w3c/sparql-12/issues/40

     Andy

On 12/09/2019 02:53, Alex To wrote:
> Hi
> 
> I have so far been happy with Jena + Lucene / Elastic. Just trying to get a
> quick answer whether it can work with other Jena based API like Virtuoso /
> MarkLogic.
> 
> If I wrap a MarkLogic Dataset in a Jena TextDataset, can it work as
> expected ?
> 
> Given that a MarkLogic / Virtuoso Dataset implements Jena Dataset
> interface, it may work but I am not sure because the "text:query" seems to
> be more Jena specific.
> 
> I will try out myself in the next couple of days to see if it works but if
> there is a quick answer it may save me a couple of hours :)
> 
> Thank a lot
> 
> Regards
>