Posted to java-user@lucene.apache.org by thanh nguyen <ng...@yahoo.com.vn> on 2006/04/21 17:36:20 UTC

Lucene, TREC, and WT10G

Hi all,

Has anyone used Lucene to index WT10G? Can it index
WT10G in compressed format (.gz), or do we have to unzip
it first?

Furthermore, does Lucene support the TREC format? I mean,
can it take a topic file like "<TOP> <NUM> 1
<TITLE>  abc def </TOP>" and produce a results file
that we can use with the trec_eval program?

Any help will be appreciated,
Thanh



	


	
		

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Lucene, TREC, and WT10G

Posted by Grant Ingersoll <gs...@syr.edu>.
It is up to you to write a program to do this, but it is relatively
easy.  You may want to search the web; chances are someone has posted
code to do this, as a number of people have used Lucene in TREC in the past.

Good luck,
Grant
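
As a rough illustration of the topic-reading half of such a program, here
is a minimal Java sketch. It assumes topics in the simplified form quoted
in the question, and also tolerates a "Number:" label after <NUM>. The
names TrecTopicParser, Topic and parse are made up for this example and
are not part of Lucene; each title would still have to be handed to
whatever query parser your Lucene version provides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not a Lucene class): parses simplified TREC topics. */
final class TrecTopicParser {

    /** One topic: its number and the title text to use as the query string. */
    static final class Topic {
        final String number;
        final String title;

        Topic(String number, String title) {
            this.number = number;
            this.title = title;
        }
    }

    // One <TOP>...</TOP> block; DOTALL lets a topic span several lines.
    private static final Pattern TOP =
            Pattern.compile("<top>(.*?)</top>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    // First token after <NUM>; an optional "Number:" label is skipped.
    private static final Pattern NUM =
            Pattern.compile("<num>\\s*(?:Number:)?\\s*([^<\\s]+)", Pattern.CASE_INSENSITIVE);
    // Everything after <TITLE> up to the next tag.
    private static final Pattern TITLE =
            Pattern.compile("<title>\\s*([^<]*)", Pattern.CASE_INSENSITIVE);

    /** Returns the topics found in the text, in file order. */
    static List<Topic> parse(String text) {
        List<Topic> topics = new ArrayList<>();
        Matcher top = TOP.matcher(text);
        while (top.find()) {
            String body = top.group(1);
            Matcher num = NUM.matcher(body);
            Matcher title = TITLE.matcher(body);
            // Topics missing either tag are skipped rather than guessed at.
            if (num.find() && title.find()) {
                // Collapse runs of whitespace so "  abc def " becomes "abc def".
                String cleaned = title.group(1).trim().replaceAll("\\s+", " ");
                topics.add(new Topic(num.group(1), cleaned));
            }
        }
        return topics;
    }
}
```

Calling TrecTopicParser.parse on the contents of the topic file gives one
Topic per <TOP> block; the number identifies the topic in the results
file and the title is the text to search for.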

thanh nguyen wrote:
> Hi trupti,
>
> Thanks for your response. I have another question:
> can Lucene take a topic file like "<TOP>
> <NUM> 1 <TITLE>  abc def </TOP>" and produce a
> result_file that we can use with the trec_eval program
> (trec_eval relevant_file result_file, where relevant_file
> is the TREC judgment file for these topics)?
>
> Thank you in advance,
> Thanh.

-- 

Grant Ingersoll 
Sr. Software Engineer 
Center for Natural Language Processing 
Syracuse University 
School of Information Studies 
335 Hinds Hall 
Syracuse, NY 13244 

http://www.cnlp.org 
Voice:  315-443-5484 
Fax: 315-443-6886 




Re: Lucene, TREC, and WT10G

Posted by thanh nguyen <ng...@yahoo.com.vn>.
Hi trupti,

Thanks for your response. I have another question:
can Lucene take a topic file like "<TOP>
<NUM> 1 <TITLE>  abc def </TOP>" and produce a
result_file that we can use with the trec_eval program
(trec_eval relevant_file result_file, where relevant_file
is the TREC judgment file for these topics)?

Thank you in advance,
Thanh.
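
For reference, the result_file that trec_eval reads is plain text with
one line per retrieved document and six whitespace-separated columns:
topic number, an iteration field (commonly the literal Q0), document
number, rank, score, and a run tag. Below is a minimal Java sketch of
producing those lines, assuming the ranked document numbers and scores
have already come back from a search. TrecRunWriter and format are
made-up names, and starting ranks at 1 is a choice made here, not a
requirement I can vouch for.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper (not a Lucene class): formats hits as trec_eval result lines. */
final class TrecRunWriter {

    /**
     * Builds one line per hit: "topic Q0 docno rank score runTag".
     * docnos and scores are parallel lists, already sorted best-first.
     */
    static List<String> format(String topic, List<String> docnos,
                               List<Double> scores, String runTag) {
        if (docnos.size() != scores.size()) {
            throw new IllegalArgumentException("docnos and scores differ in length");
        }
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < docnos.size(); i++) {
            // Locale.ROOT keeps the decimal separator a '.' on every machine.
            lines.add(String.format(Locale.ROOT, "%s Q0 %s %d %.4f %s",
                    topic, docnos.get(i), i + 1, scores.get(i), runTag));
        }
        return lines;
    }
}
```

Writing the lines for every topic into one file gives the result_file
argument; the relevant_file argument is the judgment file distributed
with the topics.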


--- trupti mulajkar <ac...@sheffield.ac.uk> wrote:

> Lucene can index the TREC documents, but it depends on
> how you want to index them.
> If you want to index the sub-files within the TREC data,
> you have to modify IndexFiles.java to read the tags;
> otherwise you can index them normally.
>  
> cheers,
> trupti mulajkar


Re: Lucene, TREC, and WT10G

Posted by trupti mulajkar <ac...@sheffield.ac.uk>.
Lucene can index the TREC documents, but it depends on how you want to index them.
If you want to index the sub-files within the TREC data, you have to modify
IndexFiles.java to read the tags; otherwise you can index them normally.
 
cheers,
trupti mulajkar
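
On the .gz part of the question: Java can read gzip-compressed files
directly through java.util.zip.GZIPInputStream, so unzipping first should
not be necessary. Here is a minimal sketch of reading the tags, under the
assumption that each record is wrapped in <DOC> and </DOC> lines of their
own and carries a <DOCNO> element. TrecGzReader, readDocs and docno are
made-up names, and ISO-8859-1 is chosen only because it maps every byte
to a character without failing.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;

/** Hypothetical helper (not a Lucene class): splits a gzipped TREC file into records. */
final class TrecGzReader {

    /** Returns the text between each <DOC> line and its closing </DOC> line. */
    static List<String> readDocs(InputStream gzipped) throws IOException {
        List<String> docs = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(gzipped), StandardCharsets.ISO_8859_1))) {
            StringBuilder current = null; // null while outside a record
            String line;
            while ((line = in.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.equalsIgnoreCase("<DOC>")) {
                    current = new StringBuilder();
                } else if (trimmed.equalsIgnoreCase("</DOC>")) {
                    if (current != null) {
                        docs.add(current.toString());
                    }
                    current = null;
                } else if (current != null) {
                    current.append(line).append('\n');
                }
            }
        }
        return docs;
    }

    /** Returns the text inside <DOCNO>...</DOCNO>, or null if the record has none. */
    static String docno(String doc) {
        int start = doc.indexOf("<DOCNO>");
        int end = doc.indexOf("</DOCNO>");
        if (start < 0 || end < start) {
            return null;
        }
        return doc.substring(start + "<DOCNO>".length(), end).trim();
    }
}
```

Passing a FileInputStream for each .gz file to readDocs yields the
records one by one. Each record could then be added to the index as its
own Lucene document, with the DOCNO kept in a stored field so it can be
written to the results file later; the field API differs between Lucene
versions, so that step is left out here.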

Quoting thanh nguyen <ng...@yahoo.com.vn>:

> Hi all,
> 
> Has anyone used Lucene to index WT10G? Can it index
> WT10G in compressed format (.gz), or do we have to unzip
> it first?
> 
> Furthermore, does Lucene support the TREC format? I mean,
> can it take a topic file like "<TOP> <NUM> 1
> <TITLE>  abc def </TOP>" and produce a results file
> that we can use with the trec_eval program?
> 
> Any help will be appreciated,
> Thanh
