Posted to java-user@lucene.apache.org by Koji Sekiguchi <ko...@rondhuit.com> on 2015/02/24 07:40:24 UTC

Tokenizer for Brown Corpus?

Hello,

Does Lucene have a Tokenizer/Analyzer for the Brown Corpus?
There don't seem to be any such tokenizers/analyzers in Lucene.

I didn't want to reinvent the wheel, so I googled, but all I got was
a list of snippets that include "the quick brown fox..." :)

Koji

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Tokenizer for Brown Corpus?

Posted by Koji Sekiguchi <ko...@rondhuit.com>.
Hi Jack,

Thanks! I'll look at it.

Koji

On 2015/02/24 22:29, Jack Krupansky wrote:
> This is the first mention that I have seen for that corpus on this list.
>
> There seem to be more than a few references when I google for
> '"brown corpus" lucene', such as:
> https://github.com/INL/BlackLab/wiki/Blacklab-query-tool
>
> -- Jack Krupansky



Re: Tokenizer for Brown Corpus?

Posted by Jack Krupansky <ja...@gmail.com>.
This is the first mention that I have seen for that corpus on this list.

There seem to be more than a few references when I google for
'"brown corpus" lucene', such as:
https://github.com/INL/BlackLab/wiki/Blacklab-query-tool

-- Jack Krupansky
