You are viewing a plain text version of this content. The canonical link for it is here.
Posted to general@lucene.apache.org by Madhu Satyanarayana Panitini <Ma...@pass-consulting.com> on 2005/09/15 09:46:34 UTC

document attributes

 
Hi all
I have already posted same message to java-user group yesterday, no
response yet, please help me out of problem
I have text docs similar to the TREC format some think like this
 
<file>
<author>xxxxxxxx</author>
<publisher>yyyyyyyy</publisher>
<text>
Full text search with one or more keywords with advanced search
operators to enhance search has to be implemented. Advanced search with
document attributes like author, title, type and Meta keyword in
addition to the search term should be implemented. </text> </file>
 
with this type of doc I can index text between "<text>" tags, but I want
to preserve the data in the other tags as well like
<author>xxxxxxxx</author> <publisher>yyyyyyyy</publisher>.
 
Because the user when searching for docs he will use author as attribute
for searching Doc,
 
Is there any possibility to index the author and publisher data and use
in the search.  please tell me how can I index and use in search.
 
And also please guide me with methods or references in IR  that use
attributes of DOC for search purpose.
 
Thanks in advance
Madhu
 
 
 
 
Madhu Satyanarayana. Panitini
PASS GCA Solution Centre Pvt Ltd.
601 Aditya Trade Centre, Ameerpet, 
Hyderabad, India. 
www.pass-consulting.com 


 

Re: document attributes

Posted by Otis Gospodnetic <ot...@yahoo.com>.
This should get you started:
  http://www-128.ibm.com/developerworks/java/library/j-lucene/

Also check Chapter 7 (3rd hit) here for working code:
  http://www.lucenebook.com/search?query=indexing+xml

Otis

--- Madhu Satyanarayana Panitini <Ma...@pass-consulting.com>
wrote:

>  
> Hi all
> I have already posted same message to java-user group yesterday, no
> response yet, please help me out of problem
> I have text docs similar to the TREC format some think like this
>  
> <file>
> <author>xxxxxxxx</author>
> <publisher>yyyyyyyy</publisher>
> <text>
> Full text search with one or more keywords with advanced search
> operators to enhance search has to be implemented. Advanced search
> with
> document attributes like author, title, type and Meta keyword in
> addition to the search term should be implemented. </text> </file>
>  
> with this type of doc I can index text between "<text>" tags, but I
> want
> to preserve the data in the other tags as well like
> <author>xxxxxxxx</author> <publisher>yyyyyyyy</publisher>.
>  
> Because the user when searching for docs he will use author as
> attribute
> for searching Doc,
>  
> Is there any possibility to index the author and publisher data and
> use
> in the search.  please tell me how can I index and use in search.
>  
> And also please guide me with methods or references in IR  that use
> attributes of DOC for search purpose.
>  
> Thanks in advance
> Madhu
>  
>  
>  
>  
> Madhu Satyanarayana. Panitini
> PASS GCA Solution Centre Pvt Ltd.
> 601 Aditya Trade Centre, Ameerpet, 
> Hyderabad, India. 
> www.pass-consulting.com 
> 
> 
>  
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org