You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by bhecht <bh...@ams-sys.com> on 2007/05/21 20:59:14 UTC

Implement a tokenizer

Hi there,

I was interested in changing the StandardTokenzier so it will not remove the
"+" (plus) sign from my stream.
Looking in the code and documentation, it reads: 

"If this tokenizer does not suit your application, please consider copying
this source code
directory to your project and maintaining your own grammar-based tokenizer."

I can't understand from this code where I should jump in, and do my change.
Can someone point me out to where I should look at in order perform my
change?

Thanks in advanced
-- 
View this message in context: http://www.nabble.com/Implement-a-tokenizer-tf3792172.html#a10724827
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: Implement a tokenizer

Posted by "Mordo, Aviran (EXP N-NANNATEK)" <av...@lmco.com>.
What you need to do is to create your own tokenizer. Just copy the code
from the StandardTokenizer to your XYZTokenizer and make your changes.
Then you need to create your own Analyzer class (again copy the code
from the StandardAnalyzer) and user your XYZTokenizer in the new
XYZAnalyzer you created.

HTH

Aviran
http://www.aviransplace.com
http://shaveh.co.il 

-----Original Message-----
From: bhecht [mailto:bhecht@ams-sys.com] 
Sent: Monday, May 21, 2007 2:59 PM
To: java-user@lucene.apache.org
Subject: Implement a tokenizer


Hi there,

I was interested in changing the StandardTokenzier so it will not remove
the "+" (plus) sign from my stream.
Looking in the code and documentation, it reads: 

"If this tokenizer does not suit your application, please consider
copying this source code directory to your project and maintaining your
own grammar-based tokenizer."

I can't understand from this code where I should jump in, and do my
change.
Can someone point me out to where I should look at in order perform my
change?

Thanks in advanced
--
View this message in context:
http://www.nabble.com/Implement-a-tokenizer-tf3792172.html#a10724827
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Implement a tokenizer

Posted by Chris Lu <ch...@gmail.com>.
Actually before you jump in, be warned that the "+" plus sign is also
part of query parser.
You can not really/easily pass the query with the "+" sign through
query parser in order to get a match.

-- 
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes

On 5/21/07, bhecht <bh...@ams-sys.com> wrote:
>
> Hi there,
>
> I was interested in changing the StandardTokenzier so it will not remove the
> "+" (plus) sign from my stream.
> Looking in the code and documentation, it reads:
>
> "If this tokenizer does not suit your application, please consider copying
> this source code
> directory to your project and maintaining your own grammar-based tokenizer."
>
> I can't understand from this code where I should jump in, and do my change.
> Can someone point me out to where I should look at in order perform my
> change?
>
> Thanks in advanced
> --
> View this message in context: http://www.nabble.com/Implement-a-tokenizer-tf3792172.html#a10724827
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org