Posted to java-user@lucene.apache.org by maxSchlein <m_...@hotmail.com> on 2010/02/15 22:28:04 UTC

Controlling what is indexed / normalizing our index

We have a list of keywords with aliases. Example:
  keyword = "ms access"
  aliases = "microsoft access", "msaccess", "m.s. access"

We would like to intercept the aliases before they are indexed and have
the keyword indexed instead.  We can already do this for single-word
aliases with a CustomFilter.  (Example: in the filter, when the token is
"access", we change its value to "msaccess".)  Our problem is multi-word
aliases: when the token equals "microsoft", we need to look ahead and find
out whether the next token is "access", that is, whether the two tokens
together match one of our aliases.

Has anyone had an issue like this?  Any and all help is appreciated.  Thanks.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Controlling what is indexed / normalizing our index

Posted by Ahmet Arslan <io...@yahoo.com>.
> We have a list of keywords with aliases (Example: 
> keyword = "ms access"
> aliases = "microsoft access", "msaccess", "m.s.
> access"  )
> 
> We would like to intercept the aliases prior to them being
> indexed, and have
> the keyword indexed instead.  We can do this with a
> CustomFilter for single
> word aliases.  (Example: in filter token = "access",
> we change value to
> "msaccess").  Our problem is when the token equals
> microsoft, we need to
> find out if the next token is access or not, that is, does
> it match one of
> our aliases.
> 
> Has anyone had an issue like?  Any and all help is
> appreciated.  Thanx.

A very similar discussion took place a short time ago:

http://www.lucidimagination.com/search/document/6a2725ae6de611b3/read_more_tokens_during_analysis#ded44745457549ed
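
The lookahead asked about in the question can be sketched independently of
Lucene. The following is only an illustration under stated assumptions, not
code from the linked discussion: AliasNormalizer, addAlias and normalize are
hypothetical names, the input is assumed to be already tokenized, and the
keyword is emitted as a single token. At each position it tries the longest
alias first, so "microsoft access" wins over any shorter match starting at
"microsoft".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of Lucene): rewrites alias token sequences to one keyword. */
final class AliasNormalizer {
    // Maps an alias, split into tokens, to the keyword that should be indexed instead.
    private final Map<List<String>, String> aliasToKeyword = new HashMap<>();
    private int maxAliasLength = 0;

    /** Registers one alias, given as its tokens, for a keyword. */
    void addAlias(String keyword, String... aliasTokens) {
        aliasToKeyword.put(new ArrayList<>(Arrays.asList(aliasTokens)), keyword);
        maxAliasLength = Math.max(maxAliasLength, aliasTokens.length);
    }

    /** Replaces the longest alias found at each position; other tokens pass through. */
    List<String> normalize(List<String> tokens) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            int consumed = 0;
            // Look ahead: try the longest candidate window first, then shorter ones.
            int longest = Math.min(maxAliasLength, tokens.size() - i);
            for (int len = longest; len >= 1; len--) {
                String keyword = aliasToKeyword.get(tokens.subList(i, i + len));
                if (keyword != null) {
                    out.add(keyword);
                    consumed = len;
                    break;
                }
            }
            if (consumed == 0) {
                out.add(tokens.get(i)); // no alias starts at this token
                consumed = 1;
            }
            i += consumed;
        }
        return out;
    }
}
```

Inside a real Lucene TokenFilter the same idea would mean buffering the
upcoming tokens before deciding what to emit, for example by saving their
state with captureState() and replaying it with restoreState() when no alias
matches. How "m.s. access" is split depends on the tokenizer in use, so that
alias would have to be registered in its tokenized form.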
