Posted to users@jena.apache.org by Immanuel Normann <im...@gmail.com> on 2020/02/07 09:02:24 UTC

loading data with tdbloader for Fuseki

Hi there,
I have Fuseki 3.14 running inside my Tomcat 9 and installed
apache-jena-3.14.
Now I am trying to work with tdbloader2 to create a database from ttl-files
for Fuseki:

bin/tdbloader2 --loc mydb mydata.ttl
This generates a directory mydb with all the indexes. My question is: how
do I make Fuseki aware of my newly created mydb?

My idea was to simply move it into /etc/fuseki/databases, where all the other
data indexes are located, and restart Fuseki. But that is apparently not
enough: the Fuseki WebGUI does not list mydb, whereas it lists all the other
datasets which I have created via the Fuseki WebGUI.

How do I notify Fuseki about the index I freshly created with tdbloader2,
without GUI interaction?

Regards,
Immanuel

Re: loading data with tdbloader for Fuseki

Posted by Jean-Claude Moissinac <je...@telecom-paristech.fr>.
Perhaps my way of creating databases (located wherever I need them) could be
useful:
* first, I interactively create a dataset in the Fuseki UI (persistent
database), say mybase
* then, I move the directory from run/databases/mybase to a place
convenient for me
* finally, I change the tdb:location item in the file
run/configuration/mybase.ttl to point to the place where the dataset now
lives
Better suggestions welcome.

--
Jean-Claude Moissinac



On Fri, 7 Feb 2020 at 12:26, Andy Seaborne <an...@apache.org> wrote:

> [...]

Re: loading data with tdbloader for Fuseki

Posted by Andy Seaborne <an...@apache.org>.


Put the database in .../run/databases and a config file in
.../run/configuration, then restart the server.
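
As a rough sketch of what such a config file could look like for a TDB1 database built by tdbloader2: the script below writes a minimal service description modeled on the kind of file the Fuseki UI generates. The FUSEKI_BASE default, the dataset name mydb and the endpoint names are assumptions for illustration; in the original poster's setup the base would be /etc/fuseki, and an absolute tdb:location is probably the safer choice.

```shell
#!/bin/sh
# Assumptions: FUSEKI_BASE is the Fuseki run area (e.g. /etc/fuseki) and the
# database directory built by tdbloader2 was moved to $FUSEKI_BASE/databases/mydb.
FUSEKI_BASE="${FUSEKI_BASE:-./run}"
DB_NAME="mydb"

mkdir -p "$FUSEKI_BASE/configuration"

# Write the service description; the server only sees it after a restart.
cat > "$FUSEKI_BASE/configuration/$DB_NAME.ttl" <<EOF
@prefix :       <#> .
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix tdb:    <http://jena.hpl.hp.com/2008/tdb#> .

:service a fuseki:Service ;
    fuseki:name                       "$DB_NAME" ;
    fuseki:serviceQuery               "query" , "sparql" ;
    fuseki:serviceUpdate              "update" ;
    fuseki:serviceReadWriteGraphStore "data" ;
    fuseki:dataset                    :dataset .

# TDB1 dataset, which is what tdbloader2 produces.
:dataset a tdb:DatasetTDB ;
    tdb:location "$FUSEKI_BASE/databases/$DB_NAME" .
EOF
```

For a TDB2 database the dataset would instead be typed tdb2:DatasetTDB2 with tdb2:location, using the prefix <http://jena.apache.org/2016/tdb#>.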

----

If you allow a GUI action, create the database in the eventual location, 
then create the dataset in the UI - it will pick up the prebuilt 
database, no restart required.

   # Initially ds2 must not exist
   tdb2.tdbloader --loc run/databases/ds2 ...datafiles...
   <create ds2 in the interface>

Fuseki and any tdbloader cannot both be using the database at the same
time.  But in the pattern above, the database is created while Fuseki is
unaware of it, then Fuseki connects to it.

In fact, the UI is POSTing a request to the server to create the dataset 
but that's not documented properly.
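
For what it's worth, that request can be sent without the UI. A minimal sketch, assuming the admin endpoint /$/datasets with the dbName and dbType form parameters described in the Fuseki HTTP administration protocol; the FUSEKI_URL default and the create_dataset helper name are made up for illustration, and under Tomcat the URL would include the webapp context path. Access to the admin endpoints may be restricted (for example to localhost) by the server's security configuration.

```shell
#!/bin/sh
# Assumption: base URL of the Fuseki server; under Tomcat this would look
# more like http://localhost:8080/fuseki.
FUSEKI_URL="${FUSEKI_URL:-http://localhost:3030}"

# create_dataset NAME TYPE
# TYPE is "mem", "tdb" (TDB1) or "tdb2".
create_dataset() {
    curl --fail --silent --show-error \
         --request POST \
         --data-urlencode "dbName=$1" \
         --data-urlencode "dbType=$2" \
         "$FUSEKI_URL/\$/datasets"
}

# Example, to be run once the bulk loader has finished:
#   create_dataset ds2 tdb2
```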

If you have a TDB2 database in Fuseki
   (confusion alert: tdbloader2 creates TDB1 databases - legacy)
you can, instead, load the data directly into Fuseki while the server is
running. That is slower at scale than the bulk loader, but there
are no other steps, so it may still be faster overall.
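
One way to do such a direct load is the SPARQL Graph Store Protocol. The sketch below assumes the dataset exposes a read-write graph store endpoint called "data" (the name used in UI-created datasets, but an assumption here); the FUSEKI_URL default and the load_turtle helper name are hypothetical.

```shell
#!/bin/sh
# Assumption: base URL of the running Fuseki server.
FUSEKI_URL="${FUSEKI_URL:-http://localhost:3030}"

# load_turtle DATASET FILE
# POST adds the triples to the default graph; PUT would replace its contents.
load_turtle() {
    curl --fail --silent --show-error \
         --request POST \
         --header "Content-Type: text/turtle" \
         --data-binary "@$2" \
         "$FUSEKI_URL/$1/data?default"
}

# Example:
#   load_turtle ds2 mydata.ttl
```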

If you have a lot of data and the machine has fast I/O, try:

tdb2.tdbloader --loader=parallel ....

(there are other --loader options)

There is no hard-and-fast rule as to which loader is faster. It is
hardware dependent.

     Andy
