Posted to general@lucene.apache.org by mik07 <so...@yahoo.de> on 2008/05/14 16:30:52 UTC

Online Question Answering demo using Lucene

[Apologies if you consider this as spam]

Hello Lucene users and developers,

I wanted to point people on this list to a Question Answering System,
developed at the University of Edinburgh, which uses Lucene to index
Wikipedia. I thought some of you might be interested in this particular use
of Lucene. We have had an online demo for a few months. It can be found here:

http://demos.inf.ed.ac.uk:8080/qualim/search?text=true&query=How+many+Munros+are+there+in+Scotland%3F

A *DRAFT* of a short paper describing a few of the features can be found
here:
http://homepages.inf.ed.ac.uk/s0570760/publications/Kaisser_ACL_2008_Demo.pdf

I would be glad if you could take a look, and I would appreciate any comments!

Best Regards,
Michael

PS: We use quite a few other tools besides Lucene in this demo. Initially we
find answers by querying major search engines (Yahoo and Google) and use
parsers (LinkParser and MiniPar) to post-process their results. We employ a
Named Entity Recognition system (GATE's ANNIE) to look for promising answer
strings. Finally, Lucene is used for Wikipedia paragraph retrieval.
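
As a rough illustration of the last step, paragraph-level retrieval, here is
a minimal Python sketch. It is not the QuALiM code and it does not use
Lucene: it builds a toy in-memory inverted index over paragraphs and ranks
them by the number of distinct query terms they contain. The function names
(tokenize, build_index, retrieve) and the sample paragraphs are made up for
the example.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(paragraphs):
    """Map each token to the set of paragraph ids that contain it."""
    index = defaultdict(set)
    for pid, paragraph in enumerate(paragraphs):
        for token in tokenize(paragraph):
            index[token].add(pid)
    return index

def retrieve(index, query, top_k=3):
    """Rank paragraph ids by how many distinct query terms they contain."""
    scores = defaultdict(int)
    for token in set(tokenize(query)):
        for pid in index.get(token, ()):
            scores[pid] += 1
    # Highest score first; ties are broken by the lower paragraph id.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [pid for pid, _ in ranked[:top_k]]

# Hypothetical sample paragraphs, not taken from the real Wikipedia index.
paragraphs = [
    "A Munro is a mountain in Scotland with a height over 3,000 feet.",
    "Edinburgh is the capital city of Scotland.",
    "Lucene is a search library written in Java.",
]
index = build_index(paragraphs)
best = retrieve(index, "How many Munros are there in Scotland?")
```

A real index would add stemming, stop-word handling and a proper scoring
function, all of which this toy version leaves out.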





-- 
View this message in context: http://www.nabble.com/Online-Question-Answering-demo-using-Lucene-tp17232494p17232494.html
Sent from the Lucene - General mailing list archive at Nabble.com.


Re: Online Question Answering demo using Lucene

Posted by mik07 <so...@yahoo.de>.

Hi David,

nice to hear from you! Bonnie's fine and yes, it's basically the same system
as used in TREC, with the exception of the Wikipedia indexing, which was done
specifically for the Web demo.

Cheers,
Michael


Dave Kor wrote:
> 
> Hi Michael,
> 
> It's nice to see you opening up the stuff that has been in development at
> Edinburgh. How's Bonnie?
> 
> Is QuALim similar to the system you used in TREC?
> 
> 
> -- 
> Regards,
> Dave Kor
> 
> 



Re: Online Question Answering demo using Lucene

Posted by Dave Kor <da...@gmail.com>.
Hi Michael,

It's nice to see you opening up the stuff that has been in development at
Edinburgh. How's Bonnie?

Is QuALim similar to the system you used in TREC?


-- 
Regards,
Dave Kor

Re: Online Question Answering demo using Lucene

Posted by mik07 <so...@yahoo.de>.
Thanks! And you are right, it's roughly the same as Powerset.

It's slower because:
* The demo runs on a single machine (not on a cluster).
* We need to query search engines through their APIs, which have a built-in
1-second delay per query.
* We parse sentences once we retrieve them from the search engines, and
parsers are still rather slow. Powerset, on the other hand, parses Wikipedia
before indexing and indexes the semantic structures. So no parsing needs to
be performed when a user submits a query (besides the parsing of that query,
I suppose).
* The Lucene index we built of the complete English Wikipedia is 8.3 GB in
size. On our machine it takes 2 seconds per query to get a result.
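
The points above suggest a simple back-of-the-envelope latency model. The
Python sketch below is only an illustration: the 1-second API delay and the
2-second Lucene lookup come from the list, while the number of search engine
queries, the number of sentences and the parsing time per sentence are
hypothetical parameters, since I have not given figures for them here. It
also assumes the stages run strictly one after another.

```python
API_DELAY_S = 1.0      # from the list above: built-in delay per search engine query
LUCENE_LOOKUP_S = 2.0  # from the list above: Wikipedia index lookup per question

def estimate_latency(num_api_queries, num_sentences, parse_time_per_sentence_s):
    """Estimate seconds per question, assuming strictly sequential stages.

    num_api_queries, num_sentences and parse_time_per_sentence_s are
    hypothetical inputs; their real values are not stated in this thread.
    """
    api_time = num_api_queries * API_DELAY_S
    parse_time = num_sentences * parse_time_per_sentence_s
    return api_time + parse_time + LUCENE_LOOKUP_S

# Lower bound: one search engine query and no parsing cost at all.
lower_bound = estimate_latency(1, 0, 0.0)

# Made-up example: 2 queries, 20 sentences at 0.25 s each.
example = estimate_latency(2, 20, 0.25)
```

Under these assumptions, parsing at index time (as Powerset does) would
remove the parse_time term from the per-question cost.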

You could address these issues with enough money and manpower. But it's
just a research project, developed by one person. We don't have the
resources. (Please drop me an email if you have some ;-)

Cheers,
Michael


Re: Online Question Answering demo using Lucene

Posted by "J. Delgado" <jd...@lendingclub.com>.
This is great. Comparable to www.powerset.com (though much slower).

I suggest you also index Freebase, the structured version of Wikipedia.

J.D.
