Posted to java-user@lucene.apache.org by "Nitasha Walia (niwalia)" <ni...@cisco.com> on 2008/04/02 20:26:45 UTC

Adding attribute to index

Hi, 
 
I am new to Lucene (Java) and need to learn how to add a new attribute
to the index so that, given a database of emails that includes sender
information, searching for a keyword returns:
1. The sender of the email
2. The email. 
 
I am using Lucene-2.3.1, and don't know where to start in the huge code
base. 
 
Can someone please advise on the same?
 
Thanks,
 	
Nitasha Walia
Software Engineer
Product Development

niwalia@cisco.com
Mobile: 412-736 4507




United States
Cisco home page <http://www.cisco.com/>

Think before you print.
This e-mail may contain confidential and privileged material for the
sole use of the intended recipient. Any review, use, distribution or
disclosure by others is strictly prohibited. If you are not the intended
recipient (or authorized to receive for the recipient), please contact
the sender by reply e-mail and delete all copies of this message.	
 


 

Re: Adding attribute to index

Posted by Michael Wechner <mi...@wyona.com>.
Nitasha Walia (niwalia) wrote:

> Hi,
>  
> I am a new user of Java Lucene and need to learn how to add a new 
> attribute, such that, given a database of emails, containing sender 
> information, searching for a keyword, results in


What kind of database do you use to store your emails?

I am asking because it might make sense to introduce a data
abstraction layer (for example JCR or Yarep) that accesses your
database and has Lucene built in. Then you wouldn't have to worry
about Lucene itself, and could instead search like this:

Node[] emails = getRepository("emails").search("sender", QUERY);
for (int i = 0; i < emails.length; i++)
    System.out.print(emails[i].getProperty("body"));

> 1. The sender of the email
> 2. The email.


Otherwise I would suggest starting at

http://lucene.apache.org/java/2_3_1/gettingstarted.html

HTH

Michael

>  
> I am using Lucene-2.3.1, and don't know where to start in the huge 
> code base.
>  
> Can someone please advise on the same?
>  



-- 
Michael Wechner
Wyona      -   Open Source Content Management - Yanel, Yulup
http://www.wyona.com
michael.wechner@wyona.com, michi@apache.org
+41 44 272 91 61


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: Adding attribute to index

Posted by "Nitasha Walia (niwalia)" <ni...@cisco.com>.
Thanks !! 

-----Original Message-----
From: Donna L Gresh [mailto:gresh@us.ibm.com] 
Sent: Wednesday, April 02, 2008 11:52 AM
To: java-user@lucene.apache.org
Subject: Re: Adding attribute to index



Re: Adding attribute to index

Posted by Donna L Gresh <gr...@us.ibm.com>.
This is "fast and loose" code (from my head; check the syntax). I *highly* 
recommend you get a copy of the book Lucene in Action; it will really 
help.

To create the index, add a document with two fields: one for the sender 
and one for the email text.

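IndexWriter indexWriter = new IndexWriter(...... 

Document emailDoc = new Document();
// Sender: stored, and indexed as one exact term (not analyzed)
Field senderField = new Field("sender", senderEmailAddress, 
    Field.Store.YES, Field.Index.UN_TOKENIZED);
emailDoc.add(senderField);
// Email text: stored, and analyzed into terms for keyword search
Field textField = new Field("emailText", textOfEmail, 
    Field.Store.YES, Field.Index.TOKENIZED);
emailDoc.add(textField);
indexWriter.addDocument(emailDoc);
// Call indexWriter.close() once all documents have been added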
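
To make the difference between the two Field.Index settings more concrete, here is a toy illustration in plain Java. It is not Lucene code and not Lucene's analyzer: the class name IndexTermsSketch and both method names are made up for this sketch, and the "lowercase and split on non-alphanumerics" rule is only a crude stand-in for what a real analyzer does (StandardAnalyzer, for instance, also drops common stop words).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy model of the two Field.Index settings; this is NOT Lucene's analyzer.
class IndexTermsSketch {

    // UN_TOKENIZED: the whole field value becomes exactly one index term.
    static List<String> unTokenized(String value) {
        List<String> terms = new ArrayList<String>();
        terms.add(value);
        return terms;
    }

    // TOKENIZED: the value is broken into terms. Here we lowercase and split
    // on anything that is not a letter or digit, as a crude analyzer stand-in.
    static List<String> tokenized(String value) {
        List<String> terms = new ArrayList<String>();
        for (String token : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // The sender address stays in one piece, so only an exact match finds it.
        System.out.println(unTokenized("alice@example.com"));
        // The email text is searchable word by word.
        System.out.println(tokenized("Meeting moved to Friday"));
    }
}
```

This is why the sender goes in as UN_TOKENIZED (you usually want the address as a single exact term) while the email text goes in as TOKENIZED (you want to find it by any word it contains).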


Then when you are searching, search in the email text field:

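// searcher is an IndexSearcher opened on the index built above.
// TermQuery is not analyzed: the term must match the indexed form exactly.
Query query = new TermQuery(new Term("emailText","searchTerm"));
Hits hits = searcher.search(query);
if (hits.length() > 0) { // hits.doc(0) fails when nothing matched
    Document doc = hits.doc(0); // best-scoring document
    String emailSender = doc.get("sender");
    String emailText = doc.get("emailText");
}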
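
If it helps to see the whole round trip without the Lucene jar, the following is a minimal sketch of the same idea in plain Java: store both fields, index only the words of the email text, then look up a keyword and read back the sender and text. Everything here (TinyEmailIndex, Email, add, search) is hypothetical and written for this illustration; it is not how Lucene is implemented, and it does no scoring, so hits come back in insertion order rather than best match first.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy inverted index in plain Java; a stand-in for the idea, NOT Lucene's API.
class TinyEmailIndex {

    // One indexed email with its two stored fields.
    static final class Email {
        final String sender;
        final String emailText;

        Email(String sender, String emailText) {
            this.sender = sender;
            this.emailText = emailText;
        }
    }

    private final List<Email> stored = new ArrayList<Email>();
    // term -> ids of the emails whose text contains that term
    private final Map<String, Set<Integer>> postings = new HashMap<String, Set<Integer>>();

    // Store both fields, but index only the words of the email text.
    void add(String sender, String emailText) {
        int id = stored.size();
        stored.add(new Email(sender, emailText));
        for (String term : emailText.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (term.isEmpty()) {
                continue;
            }
            Set<Integer> ids = postings.get(term);
            if (ids == null) {
                ids = new LinkedHashSet<Integer>();
                postings.put(term, ids);
            }
            ids.add(id);
        }
    }

    // Look up one keyword and return the stored emails that contain it.
    List<Email> search(String keyword) {
        List<Email> hits = new ArrayList<Email>();
        Set<Integer> ids = postings.get(keyword.toLowerCase(Locale.ROOT));
        if (ids != null) {
            for (int id : ids) {
                hits.add(stored.get(id));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        TinyEmailIndex index = new TinyEmailIndex();
        index.add("alice@example.com", "Budget review moved to Friday");
        index.add("bob@example.com", "Lunch on Friday?");
        // Each hit carries both things the original question asked for.
        for (Email hit : index.search("friday")) {
            System.out.println(hit.sender + ": " + hit.emailText);
        }
    }
}
```

Note that the sketch lowercases the keyword before the lookup. A Lucene TermQuery does not do that for you, which is worth keeping in mind when a search on a tokenized field unexpectedly returns nothing.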


Donna L. Gresh
Services Research, Mathematical Sciences Department
IBM T.J. Watson Research Center
(914) 945-2472
http://www.research.ibm.com/people/g/donnagresh
gresh@us.ibm.com


"Nitasha Walia (niwalia)" <ni...@cisco.com> wrote on 04/02/2008 02:26:45 
PM:

> Hi, 
> 
> I am a new user of Java Lucene and need to learn how to add a new 
> attribute, such that, given a database of emails, containing sender 
> information, searching for a keyword, results in 
> 1. The sender of the email
> 2. The email. 
> 
> I am using Lucene-2.3.1, and don't know where to start in the huge
> code base.
> 
> Can someone please advise on the same?
> 