Posted to java-user@lucene.apache.org by thanh nguyen <ng...@yahoo.com.vn> on 2006/04/21 17:36:20 UTC
Lucene, TREC, and WT10G
Hi all,
Has anyone used Lucene to index WT10G? Can it index
WT10G in compressed format (.gz), or do we have to unzip
it first?
Furthermore, does Lucene support the TREC format? I mean,
can it take a topic file like "<TOP> <NUM> 1
<TITLE> abc def </TOP>" and produce a results file
that we can use with the trec_eval program?
Any help will be appreciated,
Thanh
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Lucene, TREC, and WT10G
Posted by Grant Ingersoll <gs...@syr.edu>.
It is up to you to create a program to do this, but it is relatively
easy. You may want to search the web; chances are someone has posted
code to do this, as a number of people have used Lucene in TREC in the past.
Good luck,
Grant
--
Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
School of Information Studies
335 Hinds Hall
Syracuse, NY 13244
http://www.cnlp.org
Voice: 315-443-5484
Fax: 315-443-6886
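As a rough illustration of the kind of program Grant describes, the sketch below parses a topic file in the form shown in the question into topic-number/title pairs, which could then be turned into Lucene queries. The class name TrecTopicParser and its parse method are made up for this example, and it uses only the Java standard library. The optional "Number:" prefix after <num> is an assumption about how some TREC topic files are laid out.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: pulls topic numbers and titles out of a TREC-style topic file. */
class TrecTopicParser {

    // One topic is everything between <top> and </top> (case-insensitive).
    private static final Pattern TOP =
        Pattern.compile("<top>(.*?)</top>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    // Assumption: <num> may be followed by the literal "Number:" before the digits.
    private static final Pattern NUM =
        Pattern.compile("<num>\\s*(?:Number:)?\\s*(\\d+)", Pattern.CASE_INSENSITIVE);
    // The title runs until the next tag or the end of the topic.
    private static final Pattern TITLE =
        Pattern.compile("<title>\\s*([^<]*)", Pattern.CASE_INSENSITIVE);

    /** Returns topic number -> title text, in file order. */
    static Map<String, String> parse(String topicFileText) {
        Map<String, String> topics = new LinkedHashMap<String, String>();
        Matcher top = TOP.matcher(topicFileText);
        while (top.find()) {
            String body = top.group(1);
            Matcher num = NUM.matcher(body);
            Matcher title = TITLE.matcher(body);
            // Topics missing either field are skipped rather than guessed at.
            if (num.find() && title.find()) {
                String text = title.group(1).trim().replaceAll("\\s+", " ");
                topics.put(num.group(1), text);
            }
        }
        return topics;
    }

    public static void main(String[] args) {
        String sample = "<TOP> <NUM> 1\n<TITLE> abc def </TOP>";
        System.out.println(parse(sample)); // prints {1=abc def}
    }
}
```

Each title string would then be passed to whatever query construction you use on the Lucene side; that part is deliberately left out here.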
Re: Lucene, TREC, and WT10G
Posted by thanh nguyen <ng...@yahoo.com.vn>.
Hi trupti,
Thanks for your response. I have another question.
Can Lucene take a topic file like "<TOP>
<NUM> 1 <TITLE> abc def </TOP>" and produce a
result_file that we can use with the trec_eval program
(trec_eval relevant_file result_file, where relevant_file
is the TREC judgment file for these topics)?
Thank you in advance,
Thanh.
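On the result_file side of this question: trec_eval reads one line per retrieved document with six whitespace-separated columns (topic number, an iteration column conventionally written as Q0, document number, rank, score, run tag). Below is a minimal sketch of producing such lines from an already ranked list. The class TrecRunWriter, the Hit holder, the document numbers, and the run tag "lucene" are all hypothetical. In practice the hits would come from a Lucene search, which is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: formats ranked hits as trec_eval result lines. */
class TrecRunWriter {

    /** One retrieved document: its DOCNO and its retrieval score. */
    static final class Hit {
        final String docno;
        final double score;

        Hit(String docno, double score) {
            this.docno = docno;
            this.score = score;
        }
    }

    /**
     * Builds one "topic Q0 docno rank score runTag" line per hit.
     * The list is assumed to be sorted by descending score already.
     */
    static List<String> formatTopic(String topicId, List<Hit> rankedHits, String runTag) {
        List<String> lines = new ArrayList<String>();
        int rank = 1;
        for (Hit hit : rankedHits) {
            // Locale.ROOT keeps the decimal separator a dot on every machine.
            lines.add(String.format(Locale.ROOT, "%s Q0 %s %d %.4f %s",
                topicId, hit.docno, rank, hit.score, runTag));
            rank++;
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up document numbers and scores, standing in for real search results.
        List<Hit> hits = Arrays.asList(new Hit("DOC-A", 12.5), new Hit("DOC-B", 7.25));
        for (String line : formatTopic("1", hits, "lucene")) {
            System.out.println(line);
        }
    }
}
```

Writing these lines to a file for every topic gives something that can be passed as the second argument in "trec_eval relevant_file result_file". For the sample above, the two lines are "1 Q0 DOC-A 1 12.5000 lucene" and "1 Q0 DOC-B 2 7.2500 lucene".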
Re: Lucene, TREC, and WT10G
Posted by trupti mulajkar <ac...@sheffield.ac.uk>.
Lucene can index the TREC documents, but it depends on how you want to index them.
If you want to index the sub-files in the TREC data, then you have to modify
IndexFiles.java to read the tags; otherwise you can index them normally.
cheers,
trupti mulajkar
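One possible shape for the tag-reading step mentioned above, which also touches on the .gz question: the standard java.util.zip.GZIPInputStream can decompress a bundle while it is being read, so unzipping to disk first should not be necessary. The sketch below splits a gzipped bundle into <DOC> records keyed by <DOCNO>; each record could then be handed to Lucene as one document. It is not code from IndexFiles.java. The class name TrecGzReader is made up, and it assumes that <DOC>, </DOC>, and <DOCNO> each start on their own line and that ISO-8859-1 is an acceptable decoding.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper: splits a gzipped TREC bundle into DOCNO -> record text. */
class TrecGzReader {

    static Map<String, String> readDocs(InputStream gzipped) throws IOException {
        Map<String, String> docs = new LinkedHashMap<String, String>();
        BufferedReader in = new BufferedReader(new InputStreamReader(
            new GZIPInputStream(gzipped), StandardCharsets.ISO_8859_1));
        try {
            StringBuilder body = null; // non-null only while inside <DOC> ... </DOC>
            String docno = null;
            String line;
            while ((line = in.readLine()) != null) {
                String t = line.trim();
                if (t.equalsIgnoreCase("<DOC>")) {
                    body = new StringBuilder();
                    docno = null;
                } else if (t.equalsIgnoreCase("</DOC>")) {
                    // Records without a DOCNO are dropped in this sketch.
                    if (body != null && docno != null) {
                        docs.put(docno, body.toString());
                    }
                    body = null;
                } else if (body != null) {
                    if (docno == null && t.regionMatches(true, 0, "<DOCNO>", 0, 7)) {
                        docno = t.replaceAll("(?i)</?DOCNO>", "").trim();
                    } else {
                        body.append(line).append('\n');
                    }
                }
            }
        } finally {
            in.close();
        }
        return docs;
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny gzipped sample in memory so the example is self-contained.
        String sample = "<DOC>\n<DOCNO>DOC-1</DOCNO>\nfirst text\n</DOC>\n"
                      + "<DOC>\n<DOCNO>DOC-2</DOCNO>\nsecond text\n</DOC>\n";
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        GZIPOutputStream gz = new GZIPOutputStream(buf);
        gz.write(sample.getBytes(StandardCharsets.ISO_8859_1));
        gz.close();
        System.out.println(readDocs(new ByteArrayInputStream(buf.toByteArray())).keySet());
    }
}
```

For a real collection file, a java.io.FileInputStream opened on the .gz file would take the place of the in-memory stream, and the record text would still need whatever HTML or header stripping you want before indexing.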