Posted to general@lucene.apache.org by Murat Yakici <Mu...@cis.strath.ac.uk> on 2009/05/29 22:39:53 UTC

Re: [VOTE] Make the Open Relevance Project (ORP) an official Lucene subproject

I hope it's not too late to vote and that my vote still counts.
+1

Some comments on the collection side. Why not use a US/EU patent collection?
I guess it is freely available, or am I wrong? Or at least it could perhaps
be licensed under a less restrictive licence from somewhere. It is not the
biggest collection, but it may be a good one to have.

Some reasons to have such a collection (if it can be acquired), which might
spark some ideas:


1) Technical: the content statistics (term distributions etc.) are completely
different from those of other collections. It may require specific parsers
and tokenizer implementations.

2) Multi-language content (from national patent offices)

3) It has socio-economic benefits both for enterprises and for
inventors/creators/lawyers etc. The more relevant documents inventors can
find, the better they can prepare their patent applications. Not to mention
the patent offices and patent attorneys. Lucrative ;)

4) It's not hard to find expert judgements and to maintain a user group that
could really focus on, and devote itself to, generating relevance judgements
(compared to a nonsense, old news collection).

Cheers,

Murat Yakici
Department of Computer & Information Sciences
University of Strathclyde
Glasgow, UK
-------------------------------------------
The University of Strathclyde is a charitable body, registered in Scotland,
with registration number SC015263.


> I'd like to call a vote on adding the ORP as an official Lucene
> subproject per the proposal at
> http://wiki.apache.org/lucene-java/OpenRelevance
>   with the committers specified on the Wiki page.
>
> [] +1 - Yes, I love it
> [] 0 - I don't care
> [] -1 - I don't love it
>
> Thanks,
> Grant
>