Posted to java-user@lucene.apache.org by Flavio Eduardo de Cordova <fl...@datasul.com.br> on 2003/07/08 01:26:16 UTC

Problems with StandardTokenizer

People...

	I've created a custom analyser that uses the StandardTokenizer class
to get the tokens from the reader.
	It seemed to work fine, but I just noticed that some large documents
are not having all of their content indexed, only the starting
part of them.
	After some debugging I found that StandardTokenizer reads only up
to 10001 tokens from the reader.

	Has anybody run into something like this before? What should I
do as a workaround?

Thanks!

Flavio

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: Problems with StandardTokenizer

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Please check Lucene's jGuru FAQ; your question is answered there.

Otis
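
The reply above does not spell out the answer, so the following is only a hedged sketch of the likely cause. It assumes the cutoff comes from the index writer's per-field term limit rather than from StandardTokenizer itself; in Lucene releases of this period that limit was the public field IndexWriter.maxFieldLength, which defaults to 10,000. The class and method names below (TokenCapDemo, countTokens, indexedTokens) are hypothetical, and the whitespace splitting is a simplification that will not match StandardTokenizer's real token counts. The sketch uses no Lucene classes; it only estimates how much of a document would be dropped under such a cap.

```java
// Hypothetical, Lucene-free illustration of a per-field token cap.
final class TokenCapDemo {

    /** Counts whitespace-separated tokens (a rough stand-in for a real tokenizer). */
    static int countTokens(String text) {
        int count = 0;
        boolean inToken = false;
        for (int i = 0; i < text.length(); i++) {
            if (Character.isWhitespace(text.charAt(i))) {
                inToken = false;
            } else if (!inToken) {
                inToken = true;   // first character of a new token
                count++;
            }
        }
        return count;
    }

    /** Tokens that survive when at most maxFieldLength tokens are kept per field. */
    static int indexedTokens(int totalTokens, int maxFieldLength) {
        return Math.min(totalTokens, maxFieldLength);
    }

    public static void main(String[] args) {
        // Build a synthetic "large document" of 12000 tokens.
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < 12000; i++) {
            text.append("word").append(i).append(' ');
        }
        int total = countTokens(text.toString());
        int kept = indexedTokens(total, 10000); // 10000 = assumed default cap
        System.out.println(total + " tokens in document, " + kept
                + " indexed, " + (total - kept) + " dropped");
    }
}
```

If that assumption holds, the workaround would be to raise the limit on the writer before adding documents, for example writer.maxFieldLength = Integer.MAX_VALUE; where writer is a hypothetical IndexWriter variable, at the cost of more memory while indexing very large documents.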



