Posted to users@jena.apache.org by Marcel Ferrante <ma...@gmail.com> on 2012/02/22 20:01:35 UTC
Adding Semantics to WORDPRESS with JENA
Hi everyone,
we are developing a plugin that adds semantics to the WordPress CMS,
and we are thinking about integrating Jena with WordPress to store the metadata.
For example, we have a WordPress MU site about music artists. The artists, songs,
albums, etc. are represented as posts. So a custom post type is similar to a "class",
a post is similar to an instance, and custom fields are similar to attributes.
Here you can see more about custom posts:
http://vimeo.com/32661608 http://vimeo.com/10187055
And here is the WordPress ER diagram:
http://www.inqbation.com/wp-content/uploads/2011/05/WP3.0-ERD.png
So we can allow users to create semantic relations between posts, links,
terms... any resource in WordPress.
We have actually created an obvious triple table to store the statements, like
this:
wp_posts:
+----+------------+-------------+
| id | post_title | post_author |
+----+------------+-------------+
| 11 | car        | 12          |
| 14 | fusca      | 32          |
| 23 | my fusca   | 43          |
+----+------------+-------------+

wp_images:
+----+--------------+---------------+
| id | title        | url           |
+----+--------------+---------------+
| 9  | fusca photo1 | ../fusca1.jpg |
+----+--------------+---------------+

wp_nodes:
+----+-------+-------------------------------------------------+---------+
| id | wp_id | value                                           | literal |
+----+-------+-------------------------------------------------+---------+
| 1  | 11    | posts                                           | 0       |
| 2  | 14    | posts                                           | 0       |
| 3  | 23    | posts                                           | 0       |
| 4  | 9     | images                                          | 0       |
| 5  |       | http://purl.org/dc/terms/creator                | 0       |
| 6  |       | http://www.w3.org/2000/01/rdf-schema#subClassOf | 0       |
| 7  |       | http://www.w3.org/1999/02/22-rdf-syntax-ns#type | 0       |
| 8  |       | has_photo                                       | 0       |
| 9  |       | mileage                                         | 0       |
| 10 |       | 1981                                            | 1       |
| 11 | 43    | users                                           | 0       |
+----+-------+-------------------------------------------------+---------+
When wp_id is not null, the value is a WordPress resource (table or
table field).
When literal equals 1, the value is a literal.
wp_spo:
+----+---+---+----+
| id | s | p | o  |
+----+---+---+----+
| 1  | 2 | 6 | 1  |
| 1  | 3 | 7 | 2  |
| 1  | 3 | 5 | 11 |
| 1  | 3 | 8 | 4  |
| 1  | 3 | 9 | 10 |
+----+---+---+----+
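
As a rough illustration of the node-table-plus-triple-table layout above, the following sketch loads the sample rows into an in-memory SQLite database and resolves each triple back to readable values. SQLite, the helper names (build_db, label, resolve_triples) and the "table#row" rendering are assumptions made for the example; the plugin itself targets MySQL, and the id column of wp_spo is left to the database here.

```python
import sqlite3

# Sample rows copied from the wp_nodes table above: (id, wp_id, value, literal).
NODES = [
    (1, 11, "posts", 0),
    (2, 14, "posts", 0),
    (3, 23, "posts", 0),
    (4, 9, "images", 0),
    (5, None, "http://purl.org/dc/terms/creator", 0),
    (6, None, "http://www.w3.org/2000/01/rdf-schema#subClassOf", 0),
    (7, None, "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", 0),
    (8, None, "has_photo", 0),
    (9, None, "mileage", 0),
    (10, None, "1981", 1),
    (11, 43, "users", 0),
]

# (s, p, o) rows from wp_spo; the id column is assigned by SQLite (assumption).
TRIPLES = [(2, 6, 1), (3, 7, 2), (3, 5, 11), (3, 8, 4), (3, 9, 10)]


def build_db():
    """Create an in-memory database holding the sample nodes and triples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wp_nodes ("
        " id INTEGER PRIMARY KEY, wp_id INTEGER,"
        " value TEXT NOT NULL, literal INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE wp_spo ("
        " id INTEGER PRIMARY KEY,"
        " s INTEGER NOT NULL REFERENCES wp_nodes(id),"
        " p INTEGER NOT NULL REFERENCES wp_nodes(id),"
        " o INTEGER NOT NULL REFERENCES wp_nodes(id))"
    )
    conn.executemany("INSERT INTO wp_nodes VALUES (?, ?, ?, ?)", NODES)
    conn.executemany("INSERT INTO wp_spo (s, p, o) VALUES (?, ?, ?)", TRIPLES)
    return conn


def label(value, wp_id, literal):
    """Render a node: quoted literal, table#row for WordPress rows, else raw value."""
    if literal:
        return f'"{value}"'
    if wp_id is not None:
        return f"{value}#{wp_id}"
    return value


def resolve_triples(conn):
    """Join every triple against wp_nodes three times (once per s, p and o)."""
    rows = conn.execute(
        "SELECT s.value, s.wp_id, s.literal,"
        "       p.value, p.wp_id, p.literal,"
        "       o.value, o.wp_id, o.literal"
        " FROM wp_spo t"
        " JOIN wp_nodes s ON s.id = t.s"
        " JOIN wp_nodes p ON p.id = t.p"
        " JOIN wp_nodes o ON o.id = t.o"
        " ORDER BY t.id"
    )
    return [(label(*r[0:3]), label(*r[3:6]), label(*r[6:9])) for r in rows]


if __name__ == "__main__":
    for triple in resolve_triples(build_db()):
        print(*triple)
```

Every pattern in a query against this layout costs one join per position that has to be turned back into a value, which is the main thing to measure when judging its performance.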
The questions are:
1. Does this model perform well?
2. How would we store this using Jena SDB?
The new documentation is incomplete (it doesn't explain each
field, its datatype, and how it works):
http://incubator.apache.org/jena/documentation/sdb/database_layouts.html
This one is more detailed:
http://jena.sourceforge.net/DB/layout.html
But it's the old one.
So, how does Jena SDB store namespaces, literals, and graphs?
I'd like any documentation, even an article, that explains why this model
is better than previous models (as in the "Efficient RDF Storage and Retrieval
in Jena2" article).
Maybe someone could send a MySQL dump with a little RDF graph stored.
Even if we don't use it, it could help us to improve our model.
This could be used in many applications, such as:
- a simple semantic annotation tool with a familiar interface
- semantic portals or knowledge management systems
- an RDF explorer (for wiki.dbpedia.org/Downloads, for example)
You could say that there are already many semantic tools.
Yes, but which one has more than 70 million sites around the world?
http://en.wordpress.com/stats/
Should I send this issue to the dev mailing list?
Thanks in advance,
Marcel
--
Marcel Ferrante Silva
"The Power of Ideas"
skype: marcelferrante
msn/gtalk: marcelf@gmail.com
Re: Adding Semantics to WORDPRESS with JENA
Posted by Andy Seaborne <an...@apache.org>.
On 23/02/12 11:59, Marcel Ferrante wrote:
> Hi Andy,
>
> It`s me again ;)
So it is :-)
>
>
>> I don't think I understand this -
>>
>> 2 6 1 seems to be posts subclassof posts
>>
>> but shouldn't it be the same S and O id?
>>
>
> You are right! Excuse me, I missed the g column.
It's not the g col I'm pointing out - you seem to have two "posts" with
different ids. That's going to get confusing (i.e. wrong) in RDF.
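
One way to keep each term down to a single id is to make the identifying key explicit and intern nodes on it. The sketch below is only an illustration: the NodeTable class and the choice of (value, wp_id, literal) as the key are assumptions, not something taken from the plugin or from SDB.

```python
class NodeTable:
    """Hypothetical in-memory node table that hands out one id per distinct key."""

    def __init__(self):
        self._ids = {}   # key -> id
        self._rows = []  # keys in insertion order; id = position + 1

    def intern(self, value, wp_id=None, literal=False):
        """Return the existing id for this key, or allocate the next one."""
        key = (value, wp_id, literal)
        if key not in self._ids:
            self._rows.append(key)
            self._ids[key] = len(self._rows)
        return self._ids[key]

    def __len__(self):
        return len(self._rows)


if __name__ == "__main__":
    table = NodeTable()
    first = table.intern("posts", wp_id=14)
    again = table.intern("posts", wp_id=14)
    other = table.intern("posts", wp_id=11)
    print(first == again, first == other)
```

With this key, the same WordPress row never receives two ids, while two different rows of the same table stay distinct because their wp_id differs.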
>
> wp_spo:
> +----+---+----+---+----+
> | id | g | s  | p | o  |
> +----+---+----+---+----+
> | 1  | 1 | 2  | 6 | 1  |
> | 2  | 1 | 13 | 7 | 2  |
> | 3  | 1 | 13 | 5 | 11 |
> +----+---+----+---+----+
>
> And we could include hash and lang in wp_nodes table:
>
> wp_nodes:
> +----+-------+-------+---------+------+-------+
> | id | wp_id | value | literal | lang | hash  |
> +----+-------+-------+---------+------+-------+
> | 1  | 11    | posts | 0       | 1    | 23423 |
> | 2  | 14    | posts | 0       | 1    | 54523 |
> +----+-------+-------+---------+------+-------+
>
>
>
>> 3 7 2 seems to be posts rdf:type posts
>> 3 9 10 seems to be posts mileage 19811
>>
>> so I guess I haven't guessed the foreign key relationships correctly.
>>
>> The second triple means:
>
> <rdf:description about=" http://www.sellcars.com.br/wp#my-fusca">
> <mileage>19811</mileage>
> </rdf:description>
>
> The wp_nodes table is a bridge to WordPress resources: wp_id is the id of
> the row in the original table.
> In wp_nodes we store everything: WP resources, external resources and
> literals...
>
> Question: is it better (for performance) to split this into two different
> tables (like wp_resources and wp_literals)?
Probably makes no difference. If you do split them, it is more
complicated as you have to know which id refers to which table. id=789
may be a literal or a URI and to find it you may end up looking in both.
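
A minimal sketch of that lookup cost, using plain dictionaries as stand-ins for the tables. The function names and the probe counter are assumptions for illustration only.

```python
def lookup_single(nodes, node_id):
    """One table: a single probe returns the value and its literal flag."""
    value, is_literal = nodes[node_id]
    return value, is_literal, 1


def lookup_split(resources, literals, node_id):
    """Two tables: with only the id to go on, try resources first, then literals."""
    if node_id in resources:
        return resources[node_id], False, 1
    if node_id in literals:
        return literals[node_id], True, 2
    raise KeyError(node_id)


if __name__ == "__main__":
    resources = {5: "http://purl.org/dc/terms/creator"}
    literals = {789: "1981"}
    # The third element is the number of tables that had to be probed.
    print(lookup_split(resources, literals, 789))
```

The split version only avoids the second probe if something else (for example a flag stored next to the id in the triple table) says which table to look in.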
Andy
>
> Thanks
> Marcel
>
Re: Adding Semantics to WORDPRESS with JENA
Posted by Marcel Ferrante <ma...@gmail.com>.
Hi Andy,
It`s me again ;)
> I don't think I understand this -
>
> 2 6 1 seems to be posts subclassof posts
>
> but shouldn't it be the same S and O id?
>
You are right! Excuse me, I missed the g column.
wp_spo:
+----+---+----+---+----+
| id | g | s  | p | o  |
+----+---+----+---+----+
| 1  | 1 | 2  | 6 | 1  |
| 2  | 1 | 13 | 7 | 2  |
| 3  | 1 | 13 | 5 | 11 |
+----+---+----+---+----+

And we could include hash and lang in wp_nodes table:

wp_nodes:
+----+-------+-------+---------+------+-------+
| id | wp_id | value | literal | lang | hash  |
+----+-------+-------+---------+------+-------+
| 1  | 11    | posts | 0       | 1    | 23423 |
| 2  | 14    | posts | 0       | 1    | 54523 |
+----+-------+-------+---------+------+-------+
> 3 7 2 seems to be posts rdf:type posts
> 3 9 10 seems to be posts mileage 19811
>
> so I guess I haven't guessed the foreign key relationships correctly.
>
> The second triple means:
<rdf:description about=" http://www.sellcars.com.br/wp#my-fusca">
<mileage>19811</mileage>
</rdf:description>
The wp_nodes table is a bridge to WordPress resources: wp_id is the id of
the row in the original table.
In wp_nodes we store everything: WP resources, external resources and
literals...
Question: is it better (for performance) to split this into two different
tables (like wp_resources and wp_literals)?
Thanks
Marcel
--
Marcel Ferrante Silva
"The Power of Ideas"
skype: marcelferrante
msn/gtalk: marcelf@gmail.com
Re: Adding Semantics to WORDPRESS with JENA
Posted by Andy Seaborne <an...@apache.org>.
On 22/02/12 19:01, Marcel Ferrante wrote:
> We have actually created an obvious triple table to store the statements,
> like this:
> wp_posts:
> +----+------------+-------------+
> | id | post_title | post_author |
> +----+------------+-------------+
> | 11 | car        | 12          |
> | 14 | fusca      | 32          |
> | 23 | my fusca   | 43          |
> +----+------------+-------------+
>
> wp_images:
> +----+--------------+---------------+
> | id | title        | url           |
> +----+--------------+---------------+
> | 9  | fusca photo1 | ../fusca1.jpg |
> +----+--------------+---------------+
>
> wp_nodes:
> +----+-------+-------------------------------------------------+---------+
> | id | wp_id | value                                           | literal |
> +----+-------+-------------------------------------------------+---------+
> | 1  | 11    | posts                                           | 0       |
> | 2  | 14    | posts                                           | 0       |
> | 3  | 23    | posts                                           | 0       |
> | 4  | 9     | images                                          | 0       |
> | 5  |       | http://purl.org/dc/terms/creator                | 0       |
> | 6  |       | http://www.w3.org/2000/01/rdf-schema#subClassOf | 0       |
> | 7  |       | http://www.w3.org/1999/02/22-rdf-syntax-ns#type | 0       |
> | 8  |       | has_photo                                       | 0       |
> | 9  |       | mileage                                         | 0       |
> | 10 |       | 1981                                            | 1       |
> | 11 | 43    | users                                           | 0       |
> +----+-------+-------------------------------------------------+---------+
>
> When wp_id is not null, the value is a WordPress resource (table or
> table field).
> When literal equals 1, the value is a literal.
>
> wp_spo:
> +----+---+---+----+
> | id | s | p | o  |
> +----+---+---+----+
> | 1  | 2 | 6 | 1  |
> | 1  | 3 | 7 | 2  |
> | 1  | 3 | 5 | 11 |
> | 1  | 3 | 8 | 4  |
> | 1  | 3 | 9 | 10 |
> +----+---+---+----+
I don't think I understand this -
2 6 1 seems to be posts subclassof posts
but shouldn't it be the same S and O id?
3 7 2 seems to be posts rdf:type posts
3 9 10 seems to be posts mileage 19811
so I guess I haven't guessed the foreign key relationships correctly.
(it got rather damaged by email as well)
Andy
Re: Adding Semantics to WORDPRESS with JENA
Posted by Andy Seaborne <an...@apache.org>.
On 22/02/12 19:01, Marcel Ferrante wrote:
> The questions are:
> 1. Does this model perform well?
> 2. How would we store this using Jena SDB?
You should consider using SDB through the API. This would mean you
don't have to worry about the internal details. Maybe even store the
details in a separate store and use SPARQL to access it (e.g. Fuseki as
the database layer).
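
A minimal sketch of the "SPARQL as database layer" idea, assuming a Fuseki server reachable over the standard SPARQL protocol. The endpoint URL (default Fuseki port with a dataset called "wordpress") and the helper names are hypothetical; the two IRIs in the query are the ones that appear elsewhere in this thread.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint: Fuseki's default port with an assumed dataset name.
ENDPOINT = "http://localhost:3030/wordpress/query"

QUERY = """
SELECT ?creator WHERE {
  <http://www.sellcars.com.br/wp#my-fusca> <http://purl.org/dc/terms/creator> ?creator
}
"""


def build_sparql_request(endpoint, query):
    """Build a SPARQL protocol GET request that asks for JSON results."""
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        f"{endpoint}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )


def run_select(endpoint, query):
    """Send the query and return the result bindings (needs a running server)."""
    with urllib.request.urlopen(build_sparql_request(endpoint, query)) as resp:
        return json.load(resp)["results"]["bindings"]


if __name__ == "__main__":
    # Only prints the request URL; call run_select() once a server is available.
    print(build_sparql_request(ENDPOINT, QUERY).full_url)
```

A WordPress plugin written in PHP would issue the same HTTP request; the point is that the plugin then never touches the store's tables directly.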
> The new documentation is incomplete (it doesn't explain each
> field, its datatype, and how it works):
The G/S/P/O columns are ids (Id or Hash, depending on choice) into the node table.
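
To illustrate the hash option in general terms: a content-derived key lets a triple row be written without first asking the node table for a sequence id. The sketch below is not SDB's actual hash function; SHA-256, the field order and the node_hash name are stand-ins chosen for the example.

```python
import hashlib


def node_hash(lexical, lang="", datatype="", is_literal=False):
    """Derive a signed 64-bit key from the parts that identify an RDF term."""
    kind = "L" if is_literal else "U"
    # NUL separators keep ("ab", "c") and ("a", "bc") from colliding.
    payload = "\x00".join([kind, lexical, lang, datatype]).encode("utf-8")
    digest = hashlib.sha256(payload).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


def triple_row(s, p, o):
    """Build an (s, p, o) row of hash keys; each argument is a dict of node_hash kwargs."""
    return (node_hash(**s), node_hash(**p), node_hash(**o))


if __name__ == "__main__":
    row = triple_row(
        {"lexical": "http://www.sellcars.com.br/wp#my-fusca"},
        {"lexical": "mileage"},
        {"lexical": "1981", "is_literal": True},
    )
    print(row)
```

With sequence ids the node must be inserted (or found) before the triple can be written; with hashes the two inserts are independent, at the price of wider keys.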
> http://incubator.apache.org/jena/documentation/sdb/database_layouts.html
The exact layout details can be found in the source code for the
particular database you wish to use. There is slight variation between
different engines.
Andy