Posted to users@jena.apache.org by Lewis John Mcgibbney <le...@gmail.com> on 2012/11/05 17:36:25 UTC

Using TDB to Complement MarkLogic

Hi All,

Currently I have a stack of XML documents in MarkLogic. They get there
via an XProc pipeline. I am working to run Apache Any23 on
the XML *just* before it gets inserted into MarkLogic. I would then
like to send the extracted structure (triples, etc.) to TDB and use
it to complement structured or text-based queries within
my search application.

I need clarification on a few points, if possible:

1. The triples can easily be extracted and written to a
ByteArrayOutputStream (or a String representation of this stream), and
I assume this can be fed into TDB?

2. If the above is achievable, how exactly would TDB persist the
stream? Would this be one graph? Can someone please expand on this?

3. How would I synchronize the XML documents and the associated
content within TDB? This is my major area of confusion. I accept that
this is not in any way, shape or form related to TDB/Jena... but I am
curious to hear from anyone out there who has attempted anything
similar.
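
To make questions 1 to 3 more concrete, here is a rough sketch of the
shape I have in mind. It is plain Java with no Jena calls:
PerDocumentGraphs, replaceGraph, deleteGraph and triplesFor are
hypothetical names, the map is only an in-memory stand-in for a TDB
dataset, and the example.org URIs are made up. It assumes the extractor
output is N-Triples and that each XML document gets its own named graph
keyed by the document URI, so that re-inserting or deleting a document
replaces or drops exactly one graph. Whether that layout suits TDB is
part of what I am asking.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;

/** Hypothetical in-memory stand-in for a store with one named graph per XML document. */
final class PerDocumentGraphs {
    // Key: document URI as known to MarkLogic (assumption).
    // Value: the N-Triples lines extracted from that document.
    private final Map<String, List<String>> graphs = new HashMap<>();

    /** Replaces the whole graph for docUri with the triples read from the stream. */
    void replaceGraph(String docUri, InputStream ntriples) {
        List<String> lines = new ArrayList<>();
        try (Scanner sc = new Scanner(ntriples, StandardCharsets.UTF_8.name())) {
            while (sc.hasNextLine()) {
                String line = sc.nextLine().trim();
                if (!line.isEmpty()) {
                    lines.add(line);
                }
            }
        }
        graphs.put(docUri, lines); // old triples for this document are dropped
    }

    /** Drops the graph when the document is removed from MarkLogic. */
    void deleteGraph(String docUri) {
        graphs.remove(docUri);
    }

    /** Returns the triples stored for docUri, or an empty list if there are none. */
    List<String> triplesFor(String docUri) {
        return graphs.getOrDefault(docUri, Collections.<String>emptyList());
    }

    public static void main(String[] args) {
        // Stand-in for the extractor writing into a ByteArrayOutputStream.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] data = ("<http://example.org/doc/1> "
                + "<http://purl.org/dc/terms/title> \"Example\" .\n")
                .getBytes(StandardCharsets.UTF_8);
        out.write(data, 0, data.length);

        // Hand the raw bytes on as an InputStream, with no String round trip.
        PerDocumentGraphs store = new PerDocumentGraphs();
        store.replaceGraph("http://example.org/doc/1",
                new ByteArrayInputStream(out.toByteArray()));
        System.out.println(store.triplesFor("http://example.org/doc/1").size());
    }
}
```

Passing the bytes straight from the ByteArrayOutputStream to a
ByteArrayInputStream avoids converting to a String and back, which is
where charset problems tend to creep in.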

The idea is to complement structured or text-based queries in a
search application and have the relevant TDB content complement the
user's search... something like the provision of domain/document-
specific metadata to give more body to the search experience.

Thanks very much for any feedback on this one, I realise it is a
pretty lengthy question but any suggestions would be great.

All the best

Lewis

-- 
Lewis