Posted to java-user@lucene.apache.org by Julian Motz <me...@julianmotz.com> on 2016/11/17 16:32:16 UTC

ASCIIFoldingFilter

Hello everyone,

We're currently discussing 
<https://github.com/diacritics/database/issues/1> the use of the 
ASCIIFoldingFilter 
<https://lucene.apache.org/core/3_6_2/api/core/org/apache/lucene/analysis/ASCIIFoldingFilter.html> 
class in our diacritics project. The project is about mapping characters 
with diacritics (e.g. "ü") to their associated ASCII characters (e.g. "u" 
and "ue" in this example).

The ASCIIFoldingFilter class would help us a lot, so I hope you can 
answer the following questions:

 1. The ASCIIFoldingFilter class only maps characters with diacritics to
    their ASCII base characters, e.g. "ü" => "u", not to their
    language-specific ASCII equivalents, e.g. "ue" in this case. Why
    were these mappings left out of the class?
 2. Does anyone know whether German and Norwegian are the only
    languages that have such language-specific mappings (e.g. "ü" =>
    "ue" instead of "ü" => "u")?
 3. Can we use the data in your class (Apache license) in our project
    (MIT license) if we state the copyright in the file in our
    repository, but not in the end product that users receive when
    using our project? The project would use the data at build time to
    generate a JSON file.
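
To make the distinction in question 1 concrete, here is a minimal sketch in 
plain Java. It does not use Lucene or ASCIIFoldingFilter: the base-character 
folding is approximated with java.text.Normalizer, and the class name 
FoldingSketch, the method names, and the small German table are hypothetical 
and only serve as an illustration.

```java
import java.text.Normalizer;
import java.util.Map;

// Illustrative only: contrasts base-character folding with a
// language-specific mapping. Not the ASCIIFoldingFilter implementation.
final class FoldingSketch {

    // Hypothetical German table (a-umlaut, o-umlaut, u-umlaut, sharp s);
    // assumed for this example, not taken from Lucene.
    static final Map<Character, String> GERMAN = Map.of(
        '\u00E4', "ae",
        '\u00F6', "oe",
        '\u00FC', "ue",
        '\u00DF', "ss");

    // Base folding: decompose to NFD, then drop the combining marks.
    static String foldToBase(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    // Language-specific folding: apply the table first, then fold the rest.
    static String foldGerman(String input) {
        StringBuilder out = new StringBuilder();
        for (char c : input.toCharArray()) {
            String mapped = GERMAN.get(c);
            out.append(mapped != null ? mapped : String.valueOf(c));
        }
        return foldToBase(out.toString());
    }

    public static void main(String[] args) {
        String name = "M\u00FCller";          // "Mueller" written with u-umlaut
        System.out.println(foldToBase(name)); // base folding
        System.out.println(foldGerman(name)); // German-specific folding
    }
}
```

With this sketch, the base folding turns the u-umlaut into "u", while the 
German table turns it into "ue"; a per-language table like this is the kind 
of data our JSON file would need in addition to the base mappings.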

Thanks in advance.

Cheers,
Julian