Posted to dev@lucene.apache.org by Alexandre Rafalovitch <ar...@gmail.com> on 2018/07/22 01:27:20 UTC

RefGuide: Are we missing English section in Language Analysis page on purpose?

Hi,

I am looking at: https://lucene.apache.org/solr/guide/7_4/language-analysis.html

And it has a lot of information on individual languages - but not
actually for English. This feels like a curious omission, especially
given that we do have a couple of filters that would be interesting to
mention and are not necessarily obvious (e.g. KStemFilterFactory).

Am I missing something?

Regards,
   Alex.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: RefGuide: Are we missing English section in Language Analysis page on purpose?

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
Well, the point is not so much whether English is supported, but
specifically which filters, tokenizers, and resources we have for
English. I am not sure where else we describe them grouped together
in that way.

Now that I know it is not an on-purpose thing, I've created the JIRA
(SOLR-12580) and will try to contribute when I have time.

Regards,
   Alex.


On 23 July 2018 at 11:13, Cassandra Targett <ca...@gmail.com> wrote:
> The omission isn't that curious IMO; it's a relatively common assumption on
> the part of native English speakers reading docs written in English that of
> course English is supported.
>
> However, I can see your point if I try to envision reading the Guide as
> someone working in a situation where non-English languages are primary - the
> initial assumption is likely different. If you believe the language
> discussion can be improved by specifically describing best practices for
> English, patches are always welcome.
>



Re: RefGuide: Are we missing English section in Language Analysis page on purpose?

Posted by Cassandra Targett <ca...@gmail.com>.
The omission isn't that curious IMO; it's a relatively common assumption on
the part of native English speakers reading docs written in English that of
course English is supported.

However, I can see your point if I try to envision reading the Guide as
someone working in a situation where non-English languages are primary -
the initial assumption is likely different. If you believe the language
discussion can be improved by specifically describing best practices for
English, patches are always welcome.
