Posted to java-user@lucene.apache.org by jason <gi...@gmail.com> on 2006/01/17 15:34:33 UTC

Use the lucene for searching in the Semantic Web.

Hi friends,

What do you think about using Lucene for searching in the Semantic Web? I am
trying to use Lucene to search documents with ontological annotations,
but I have not found a good model for combining the keyword information with
the ontological information.

regards
jiang xing

Re: Use the lucene for searching in the Semantic Web.

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Jan 17, 2006, at 12:25 PM, jason wrote:
> I think Kowari is a system for searching information in RDF files.
> It only finds information in the metadata files. However, I think
> one problem of the Semantic Web is this: if we have a document and its
> RDF annotation, how do we retrieve the document? Right now, we can use a
> keyword-based method to find documents relevant to the user's query, and
> some other technology to find the metadata files. But can we combine the
> two processes, and if so, how?

It's not quite true that Kowari only deals with RDF.  It can load and
parse HTML, for example, and load it directly into a LuceneModel for
full-text indexing.  You can then query structured information in
conjunction with full-text Lucene queries.  To get other types of
content in, you would need to extend Kowari or hand in literal text,
but it could be done.

	Erik




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Use the lucene for searching in the Semantic Web.

Posted by jason <gi...@gmail.com>.
Hi Erik,

Thanks for your reply.

I think Kowari is a system for searching information in RDF files.
It only finds information in the metadata files. However, I think
one problem of the Semantic Web is this: if we have a document and its
RDF annotation, how do we retrieve the document? Right now, we can use a
keyword-based method to find documents relevant to the user's query, and
some other technology to find the metadata files. But can we combine the
two processes, and if so, how?

regards
Jiang Xing



Re: Use the lucene for searching in the Semantic Web.

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
Have a look at Kowari - http://www.kowari.org

It is a scalable RDF engine that also has full-text search support  
via Lucene.

Professionally I tinker with semweb and search topics, and eventually  
we'll have something to show for these efforts :)

	Erik




Re: Use the lucene for searching in the Semantic Web.

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
For some semweb + full-text searching real-world examples, also look  
to the SIMILE project - http://simile.mit.edu/

They have integrated Lucene into PiggyBank and Longwell.

	Erik





Re: Use the lucene for searching in the Semantic Web.

Posted by xing jiang <gi...@gmail.com>.
Hi,

I have done a small survey of information retrieval on the Semantic Web.
(I may have missed many papers; most of the ones I used were published at
recent WWW and CIKM conferences. :)

1. A typical way of using an ontology is to select exact terms from the
domain ontology to form queries. The first such system may be OntoSeek (
www.loa-cnr.it/Papers/OntoSeek.pdf ).  Similar work is Latifur Khan's
("Retrieval effectiveness of an ontology-based model for information
selection", VLDB 2004).

2. Guha et al. ("Semantic Search", WWW2003) used a domain ontology to form a
concept graph. Users then only need to browse the generated concept graph.
Similar work is Eero Hyvonen's "MuseumFinland". Both use the semantic
structure of the domain ontology to help users browse.

3. QuizRDF (www.cs.rutgers.edu/~shklar/www11/final_submissions/paper6.pdf)
uses another method of exploiting the domain ontology.  Klaus, I think your
method should be better than QuizRDF.

One interesting method I found is Rocha et al.'s work ("A hybrid approach for
searching in the Semantic Web", WWW2004). They still used a keyword-based
method for retrieving documents on the Semantic Web. But I cannot find any
more information about their work, and the application I am building can be
seen as an extension of it.

Actually, Swoogle focuses only on ontology-level files. It crawls RDF, OWL
and DAML files, but it does not provide any new way to combine this with the
traditional keyword method for searching text files. Li Ding used a variant
of the PageRank method for ontology files, but I am not sure this method can
be combined with the standard PageRank method.

Maybe I have missed many things in this survey. However, I think we may be
able to find some good new methods of using domain ontologies in the
Semantic Web.

Yours truly,
Jiang Xing




On 1/19/06, Klaus <kl...@vommond.de> wrote:
>
> Hello,
>
> > Hi,
> >
> > I think one problem of the existing method is that, to query RDF
> > files or similar structures, we have to form SQL-like queries. However,
> > to search text files, we only need to type a few keywords. Can we
> > combine the two methods, and how? For instance, I would only need to
> > enter some keywords.
>
> Yes, you are right. At the moment I offer the users a UI where they can
> input some keywords and, in addition, an RQL-like query via drop-down
> menus. With the help of this semantic query they can demarcate the
> result set, e.g. by saying that all results should belong to one class
> or deal with one theme.
>
> Now I am trying to automate the generation of the query... but I'm not
> sure how to do this exactly. Maybe I will use some kind of pseudo
> relevance feedback to do some semantic analysis on the first result set.
>
> > Why do we have to learn some SQL-like language for
> > searching in the Semantic Web?
>
> Maybe this paper can help you... Primarily, the Semantic Web is for agents
> and so on, not for humans. So the information has to have a structure
> that can be exploited.
>
> http://www.sciam.com/article.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21
>
> By the way, maybe you should take a look at http://swoogle.umbc.edu/. There
> are also quite a large number of papers on scholar.google.
>
> Do you have any ideas right now?
>
> Peace
>
> Klaus


--
Regards

Jiang Xing
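
A rough sketch of the pseudo relevance feedback idea Klaus mentions in the
quoted message, in plain Python rather than Lucene so that it runs on its own.
Klaus says he is not sure how to do this, so this is only one possible reading:
run the keywords first, take the most common class among the top hits as the
guessed semantic constraint, then restrict the result set to that class. The
`rdf_class` field, the function names, the term-count scoring and the sample
documents are all assumptions made up for illustration.

```python
from collections import Counter


def keyword_search(docs, query):
    """Rank doc ids by how often the query terms occur in the text (toy scoring)."""
    terms = query.lower().split()
    scored = []
    for doc_id, doc in docs.items():
        words = doc["text"].lower().split()
        score = sum(words.count(t) for t in terms)
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties are broken by doc id to keep the order stable.
    return [doc_id for score, doc_id in sorted(scored, key=lambda s: (-s[0], s[1]))]


def feedback_class(docs, ranked, top_k=3):
    """Guess a class constraint: the most common class among the top_k hits."""
    classes = Counter(docs[d]["rdf_class"] for d in ranked[:top_k])
    return classes.most_common(1)[0][0] if classes else None


def search_with_feedback(docs, query, top_k=3):
    """First pass: keywords only. Second pass: keep only the guessed class."""
    ranked = keyword_search(docs, query)
    cls = feedback_class(docs, ranked, top_k)
    return cls, [d for d in ranked if docs[d]["rdf_class"] == cls]


# Hypothetical documents, each annotated with a single class (an assumption).
docs = {
    "d1": {"text": "jaguar habitat in the rainforest", "rdf_class": "Animal"},
    "d2": {"text": "jaguar diet and jaguar hunting", "rdf_class": "Animal"},
    "d3": {"text": "jaguar dealership opening hours", "rdf_class": "Car"},
    "d4": {"text": "rainforest climate", "rdf_class": "Place"},
}

cls, hits = search_with_feedback(docs, "jaguar")
print(cls, hits)
```

The user only types keywords; the class restriction that would otherwise come
from the drop-down menus is derived from the first result set. Whether the
majority class of the top hits is a good guess depends heavily on the data.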

Re: Use the lucene for searching in the Semantic Web.

Posted by adasal <ad...@gmail.com>.
Presumably because this is the way of formulating an inductive statement.
Just entering keywords doesn't introduce the notion of a relationship
between some known and some other unknown terms.

> Queries match graph patterns against the target graph of the query.


From http://www.w3.org/TR/2004/WD-rdf-sparql-query-20041012/
In other words, there is a homomorphic relationship between the target graph
query and its solution.
And forming a query implies knowledge of the data set. It is not knowledge
of an SQL-like language that is required, but some knowledge of the data
set that the language conveys. So a solution that does this in one step
would have to include some representation of the data set to the user, or an
automatic mapping from the arbitrary query to the data. It seems to me that
this would be difficult to accomplish, as it contains the problem of how the
metadata is defined. For instance, suppose it is an ontology, but it uses the
term "partner" for all sorts of couples, from friends to married partners.
(This might have been right when defining the ontology, for instance for use
on hospital information cards, where what is meant is the person who might be
with you at the time of the, say, not very serious procedure.)
Defining synonyms after the event, or using statistical contextual analysis
for the same purpose, may help, but these are the problems as I see them.
Adam
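
To make the point about graph patterns concrete, here is a toy triple-pattern
matcher in plain Python. It is not SPARQL and not any real RDF library; the
function names, the "?x" variable convention and the sample triples are
assumptions for illustration only. It shows that a pattern only returns
solutions when the user already knows which predicates the data set uses, and
that a loosely defined term such as "partner" returns every kind of couple.

```python
def match(pattern, triple, bindings):
    """Try to extend bindings so that pattern equals triple; None on failure."""
    new = dict(bindings)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # Variable: bind it, or check it against an earlier binding.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            # Constant: must match the data exactly.
            return None
    return new


def query(patterns, graph):
    """Return every variable binding under which all patterns occur in graph."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in graph:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Hypothetical data set in which "partner" covers very different relationships.
graph = [
    ("alice", "partner", "bob"),    # married couple
    ("carol", "partner", "dave"),   # friend who came along to the hospital
    ("alice", "worksAt", "clinic"),
]

# Works only because we know the data set uses the predicate "partner".
partners = query([("?x", "partner", "?y")], graph)

# A guess at the vocabulary that the data set does not use finds nothing.
spouses = query([("?x", "spouse", "?y")], graph)

# Two patterns sharing ?x: a relationship between known and unknown terms.
joined = query([("?x", "partner", "?y"), ("?x", "worksAt", "?w")], graph)
print(partners, spouses, joined)
```

A bare keyword such as "spouse" carries no such pattern, so any mapping from
keywords to the graph has to supply both the predicate and its intended sense.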


Re: Use the lucene for searching in the Semantic Web.

Posted by xing jiang <gi...@gmail.com>.
Hi,

I think one problem of the existing method is that, to query RDF
files or similar structures, we have to form SQL-like queries. However, to
search text files, we only need to type a few keywords. Can we
combine the two methods, and how? For instance, I would only need to
enter some keywords, and the system would handle the rest of the process.
Why do we have to learn some SQL-like language for
searching in the Semantic Web?

regards
Jiang Xing



--
Regards

Jiang Xing

Re: Use the lucene for searching in the Semantic Web.

Posted by Klaus <kl...@vommond.de>.
Hi Jiang,

I'm currently facing a similar problem. Up to now I have had to use a graph
matching algorithm for the semantic query, but the full-text search in the
semantic web is performed by Lucene.
At first I wrote the whole text into one index. Each document contains one
field for the unique id and one for the whole text. For the semantic markup I
use an extra index. Every RDF triple results in a document with the
following fields: id, predicate + subject + object. Every query is executed
on both indexes. I use an extra index for the RDF data because this results
in a higher score for the documents. You might argue that this would
adulterate the result, but from my point of view explicit metadata should
be scored higher than terms in the document body.

Cheers,

Klaus
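
A minimal sketch of the two-index layout described above, written in plain
Python rather than against the Lucene API so that it runs on its own. The
names (`TinyIndex`, `index_triples`, `search_both`), the term-count scoring
and the metadata boost of 2.0 are assumptions for illustration; they are not
Lucene classes and not Klaus's actual code.

```python
from collections import Counter, defaultdict


class TinyIndex:
    """Toy inverted index: term -> {entry key: term frequency}."""

    def __init__(self):
        self.postings = defaultdict(Counter)

    def add(self, key, text):
        for term in text.lower().split():
            self.postings[term][key] += 1

    def search(self, query):
        """Return {entry key: score}; the score is the summed term frequency."""
        scores = Counter()
        for term in query.lower().split():
            for key, tf in self.postings.get(term, {}).items():
                scores[key] += tf
        return scores


def index_triples(index, doc_id, triples):
    # One index entry per triple, as in the description above. The entry key
    # carries the id of the annotated document so hits can be mapped back.
    for n, (subject, predicate, obj) in enumerate(triples):
        index.add((doc_id, n), f"{predicate} {subject} {obj}")


def search_both(text_index, rdf_index, query, metadata_boost=2.0):
    """Run the query on both indexes and merge the scores per document."""
    combined = Counter()
    for doc_id, score in text_index.search(query).items():
        combined[doc_id] += score
    for (doc_id, _), score in rdf_index.search(query).items():
        # Explicit metadata counts for more than a term in the body text.
        combined[doc_id] += metadata_boost * score
    return combined.most_common()


text_index = TinyIndex()
rdf_index = TinyIndex()

# Hypothetical documents and one hypothetical annotation.
text_index.add("doc1", "a painting of a ship in a storm")
text_index.add("doc2", "ship ship timetable")
index_triples(rdf_index, "doc1", [("doc1", "depicts", "ship")])

results = search_both(text_index, rdf_index, "ship")
print(results)
```

Here doc2 mentions "ship" more often in its body, but doc1 ranks first because
the annotation hit is added on top of its text score. The explicit boost
factor is a stand-in for the higher score that the separate index produces in
the setup described above.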
