Posted to java-user@lucene.apache.org by Christian Beil <ch...@gmail.com> on 2014/08/20 17:48:29 UTC

Two-pass TokenFilter

Hey guys,

I need a TokenFilter that filters some tokens like the FilteringTokenFilter.
The problem is, in order to do the filtering I need to know all tokens in
advance.

My idea is to adapt the CachingTokenFilter so that it collects all tokens
in the first pass.
In the second pass it can use this information to filter the tokens.

Or is there a better solution to do this?

Thanks,
Christian

Re: Two-pass TokenFilter

Posted by Christian Beil <ch...@gmail.com>.
Hi Ahmet,

Sorry, I wasn't very clear.
I need a TokenFilter that filters/skips some tokens, just like
FilteringTokenFilter.
To determine which tokens to filter, I first need to know all tokens.
Therefore I need to go through all tokens twice.

I implemented a TwoPassTokenFilter that is very similar to
CachingTokenFilter.
On the first call to incrementToken() it builds a cache by going through all
tokens; that is the first pass.
The following calls to incrementToken() form the second pass.
In the second pass I can use the information collected in the first pass.
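
As a rough illustration of that pattern, here is a minimal sketch in plain Java.
It deliberately does not use the Lucene API: tokens are modeled as strings, the
class name LazyTwoPassTokens and the method nextToken() are made up for this
example, and the criterion (keep only tokens that occur at least minCount times)
is just an assumed stand-in for whatever the real filter needs to decide.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a two-pass token filter; not a Lucene class.
final class LazyTwoPassTokens {
    private final Iterator<String> input;
    private final int minCount;           // assumed criterion: minimum occurrences
    private List<String> cache;           // stays null until the first call
    private Map<String, Integer> counts;  // information gathered in the first pass
    private Iterator<String> replay;      // walks the cache in the second pass

    LazyTwoPassTokens(Iterator<String> input, int minCount) {
        this.input = input;
        this.minCount = minCount;
    }

    // Plays the role of incrementToken(): returns the next surviving token,
    // or null once the buffered tokens are exhausted.
    String nextToken() {
        if (cache == null) {
            fillCache(); // first pass runs once, on the first call
        }
        // Second pass: replay the buffered tokens, skipping rejected ones.
        while (replay.hasNext()) {
            String token = replay.next();
            if (counts.get(token) >= minCount) {
                return token;
            }
        }
        return null;
    }

    // First pass: consume the whole input, buffering and counting every token.
    private void fillCache() {
        cache = new ArrayList<>();
        counts = new HashMap<>();
        while (input.hasNext()) {
            String token = input.next();
            cache.add(token);
            counts.merge(token, 1, Integer::sum);
        }
        replay = cache.iterator();
    }
}
```

In an actual TokenFilter the cache would presumably hold captured attribute
states (captureState()/restoreState(), as CachingTokenFilter does) rather than
strings, and the position increments of skipped tokens would likely need the
same care that FilteringTokenFilter takes.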

Christian



2014-08-24 13:50 GMT+02:00 Ahmet Arslan <io...@yahoo.com.invalid>:

> Hi,
>
> Can you elaborate more, what do you mean by "I need to know all tokens
> in advance."
>
> Ahmet
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Re: Two-pass TokenFilter

Posted by Ahmet Arslan <io...@yahoo.com.INVALID>.
Hi,

Can you elaborate more, what do you mean by "I need to know all tokens in advance."

Ahmet


