Posted to users@jena.apache.org by Jomari Peterson <jo...@andrew.cmu.edu> on 2013/01/17 22:20:45 UTC

Union of TDB named graphs for SPARQL Querying

Good Day,
I have several different files that are stored in separate TDB folders. I
want to bring them all together to create a model that is easier to search.
I currently know how to name each of the TDB folders as named graphs, but I
believe my code is extremely inefficient. I would appreciate any advice on
how I can make this more efficient or use some of the statements that are
already available that will make things more straightforward. I have added
what I am doing currently. It works, but seems like a workaround.  I
appreciate your time and assistance.

    String fileNameOrUri = "TDBStore/Test0/triples0000.nt";
    String fileNameOrUri1 = "TDBStore/Test1/triples0001.nt";
    String directory0 = "TDBStore/Test0";
    String directory1 = "TDBStore/Test1";

    // To read quad
    TDB.getContext().set(TDB.symUnionDefaultGraph, true);
    Dataset dataset0 = TDBFactory.createDataset(directory0);
    Dataset dataset1 = TDBFactory.createDataset(directory1);
    Model model0 = dataset0.getNamedModel("urn:x-arq:UnionGraph");
    Model model1 = dataset1.getNamedModel("urn:x-arq:UnionGraph");

    // merge the graphs
    Model modelf = model0.union(model1);

Jomari Peterson
"Creating the Context for Miracles"
707-373-1093

Re: Union of TDB named graphs for SPARQL Querying

Posted by Jomari Peterson <jo...@gmail.com>.
Thank you for the reply. I just want to clarify my situation/issue.

The issue for me is that I am planning on using the Freebase RDF dump, but
I want to test my work on smaller sets to verify my thought process and
code. In addition, I am using the Basekb version of the dump that is in
1000+ different files of triples. The only way to make that into one
database would be to merge them together in one file, correct? I am unsure
how to reconcile that all together and still maintain a set of data that is
reasonable and effective to query.

I have definitely read the documentation on the site and outside sources,
but due to my relative newness, I am sometimes unclear on the terminology
and the implications of decisions. E.g., since a TDB store has to be in one
folder only, does that mean that TDB storage only represents one file of
triples, or can it index multiple files of triples into one TDB
storage/index? If so, how would one do that?

I really appreciate the reply and assistance. By the way, I find the work
you guys are doing phenomenal. I hope one day to be able to contribute in
some form or another.

Jomari Peterson
"Creating the Context for Miracles"
707-373-1093
On Jan 18, 2013 3:45 AM, "Andy Seaborne" <an...@apache.org> wrote:

> On 17/01/13 21:20, Jomari Peterson wrote:
>
>> Good Day,
>> I have several different files that are stored in separate TDB folders. I
>> want to bring them all together to create a model that is easier to search.
>> I currently know how to name each of the TDB folders as named graphs, but I
>> believe my code is extremely inefficient. I would appreciate any advice on
>> how I can make this more efficient or use some of the statements that are
>> already available that will make things more straightforward. I have added
>> what I am doing currently. It works, but seems like a workaround.  I
>> appreciate your time and assistance.
>>
>>     String fileNameOrUri = "TDBStore/Test0/triples0000.nt";
>>     String fileNameOrUri1 = "TDBStore/Test1/triples0001.nt";
>>     String directory0 = "TDBStore/Test0";
>>     String directory1 = "TDBStore/Test1";
>>
>>     // To read quad
>>     TDB.getContext().set(TDB.symUnionDefaultGraph, true);
>>     Dataset dataset0 = TDBFactory.createDataset(directory0);
>>     Dataset dataset1 = TDBFactory.createDataset(directory1);
>>     Model model0 = dataset0.getNamedModel("urn:x-arq:UnionGraph");
>>     Model model1 = dataset1.getNamedModel("urn:x-arq:UnionGraph");
>>
>>     // merge the graphs
>>     Model modelf = model0.union(model1);
>>
>>                    Jomari Peterson
>>
>
> Does each TDB dataset contain one single graph or many?
>
> It's only going to be more efficient if you can put all the data into one
> database.
>
> If you can, put all the graphs into separate named graphs in one database
> and use "urn:x-arq:UnionGraph".
>
>         Andy
>
>

Re: Union of TDB named graphs for SPARQL Querying

Posted by Andy Seaborne <an...@apache.org>.
On 17/01/13 21:20, Jomari Peterson wrote:
> Good Day,
> I have several different files that are stored in separate TDB folders. I
> want to bring them all together to create a model that is easier to search.
> I currently know how to name each of the TDB folders as named graphs, but I
> believe my code is extremely inefficient. I would appreciate any advice on
> how I can make this more efficient or use some of the statements that are
> already available that will make things more straightforward. I have added
> what I am doing currently. It works, but seems like a workaround.  I
> appreciate your time and assistance.
>
>     String fileNameOrUri = "TDBStore/Test0/triples0000.nt";
>     String fileNameOrUri1 = "TDBStore/Test1/triples0001.nt";
>     String directory0 = "TDBStore/Test0";
>     String directory1 = "TDBStore/Test1";
>
>     // To read quad
>     TDB.getContext().set(TDB.symUnionDefaultGraph, true);
>     Dataset dataset0 = TDBFactory.createDataset(directory0);
>     Dataset dataset1 = TDBFactory.createDataset(directory1);
>     Model model0 = dataset0.getNamedModel("urn:x-arq:UnionGraph");
>     Model model1 = dataset1.getNamedModel("urn:x-arq:UnionGraph");
>
>     // merge the graphs
>     Model modelf = model0.union(model1);
>
> Jomari Peterson

Does each TDB dataset contain one single graph or many?

It's only going to be more efficient if you can put all the data into 
one database.

If you can, put all the graphs into separate named graphs in one 
database and use "urn:x-arq:UnionGraph".

	Andy
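
For anyone following the thread, the suggestion above (one database,
one named graph per source, then query "urn:x-arq:UnionGraph") might
look roughly like the sketch below. It is only an illustration under
some assumptions: it uses the org.apache.jena.* package names of Jena
3.x/4.x (releases from around the time of this thread used
com.hp.hpl.jena.* instead), the target directory "TDBStore/All", the
graph URIs under http://example.org/graph/ and the class and method
names are all made up for the example, and the two .nt paths are the
ones from the original post.

```java
import org.apache.jena.query.Dataset;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.ReadWrite;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.riot.RDFDataMgr;
import org.apache.jena.tdb.TDBFactory;

public class UnionGraphSketch {
    // Special graph name from the thread: the union of all named graphs.
    static final String UNION_GRAPH = "urn:x-arq:UnionGraph";

    // Hypothetical naming scheme: one graph URI per input file.
    static String graphName(int index) {
        return "http://example.org/graph/" + index;
    }

    // Load each file into its own named graph of the same dataset.
    static void load(Dataset dataset, String... files) {
        dataset.begin(ReadWrite.WRITE);
        try {
            for (int i = 0; i < files.length; i++) {
                Model graph = dataset.getNamedModel(graphName(i));
                // The RDF syntax is guessed from the ".nt" file extension.
                RDFDataMgr.read(graph, files[i]);
            }
            dataset.commit();
        } finally {
            // Ends the transaction; aborts it if commit() was not reached.
            dataset.end();
        }
    }

    // Count the triples visible through the union of all named graphs.
    static long countUnion(Dataset dataset) {
        dataset.begin(ReadWrite.READ);
        try {
            Model union = dataset.getNamedModel(UNION_GRAPH);
            String query = "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }";
            try (QueryExecution qe = QueryExecutionFactory.create(query, union)) {
                return qe.execSelect().next().getLiteral("n").getLong();
            }
        } finally {
            dataset.end();
        }
    }

    public static void main(String[] args) {
        // Assumed location for the single combined TDB database.
        Dataset dataset = TDBFactory.createDataset("TDBStore/All");
        try {
            load(dataset,
                 "TDBStore/Test0/triples0000.nt",
                 "TDBStore/Test1/triples0001.nt");
            System.out.println("Triples in union graph: " + countUnion(dataset));
        } finally {
            dataset.close();
        }
    }
}
```

The same load() call can take any number of files, so a dump split
across many files can still end up in one TDB directory. For very large
inputs the command-line bulk loader that ships with TDB is probably a
better fit than loading through the Java API like this.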