Posted to java-user@lucene.apache.org by Julian Motz <me...@julianmotz.com> on 2016/11/17 16:32:16 UTC
ASCIIFoldingFilter
Hello everyone,
We're currently discussing
<https://github.com/diacritics/database/issues/1> whether to use the
ASCIIFoldingFilter
<https://lucene.apache.org/core/3_6_2/api/core/org/apache/lucene/analysis/ASCIIFoldingFilter.html>
class in our diacritics project. The project will map characters with
diacritics (e.g. "ü") to their associated ASCII characters (e.g. "u" and
"ue" in this example).
The ASCIIFoldingFilter class would help us a lot, so I hope you can
answer the following questions:
1. The ASCIIFoldingFilter class only maps characters with diacritics to
their ASCII base characters, e.g. "ü" => "u", not to their
language-specific ASCII equivalents, e.g. "ue" in this case. Why were
these mappings left out of the class?
2. Does anyone know whether German and Norwegian are the only
languages that have such language-specific mappings (e.g. "ü" =>
"ue" instead of "ü" => "u")?
3. May we use the data from your class (Apache License) in our project
(MIT License) if we name the copyright inside the file in our
repository, but not in the end product that users will have when
using our project? The project would use the data at build time to
generate a JSON file.
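To make the distinction in question 1 concrete, here is a minimal Python
sketch of the two kinds of mapping. It is not Lucene code and does not
reproduce the ASCIIFoldingFilter tables: the base folding simply strips
Unicode combining marks, and GERMAN_OVERRIDES, fold_to_base and
fold_for_language are hypothetical names chosen for illustration only.

```python
import unicodedata

# Hypothetical language-specific overrides for German (illustration only;
# this table is an assumption, not data taken from ASCIIFoldingFilter).
GERMAN_OVERRIDES = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
}


def fold_to_base(text):
    """Map characters to their base characters by dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fold_for_language(text, overrides):
    """Apply language-specific replacements first, then fold what is left."""
    # Normalize to NFC so precomposed keys such as "ü" match the input.
    composed = unicodedata.normalize("NFC", text)
    replaced = "".join(overrides.get(ch, ch) for ch in composed)
    return fold_to_base(replaced)


print(fold_to_base("Müller"))                         # base-character folding
print(fold_for_language("Müller", GERMAN_OVERRIDES))  # German-specific folding
```

Keeping the overrides in a separate table per language means the generic
folding stays language-neutral, and each language only lists the
characters where it differs.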
Thanks in advance.
Cheers,
Julian