Posted to dev@openoffice.apache.org by "Sahand.T" <sa...@gmail.com> on 2012/07/21 03:42:30 UTC

How does hunspell deal with dots in wordlists?

Hi

I'm about to create a wordlist and am considering including
abbreviations in the wordlist. Before I do that I need to know how
hunspell deals with dots (.) in the wordlist. Are dots even allowed in
the wordlist for hunspell? If so, can I even write an abbreviation
like "O.K." with a dot after the final letter and have hunspell
correct "O.K" to "O.K."?

I tried this:

-------------------------

$ test.dic
1
O.K.

$ test.txt
O.K
O.K.

$ analyze test.aff test.dic test.txt
> O.K
Unknown word.
> O.K.

-------------------------

The "O.K" turned up as an unknown word, and the "O.K." (with a dot at
the end) didn't show anything at all. What does that mean?
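
For anyone who wants to reproduce this, the test files above are small
enough to regenerate from a script. A minimal Python sketch, assuming the
layout shown in test.dic (an entry count on the first line, then one entry
per line). The helper names dic_text and write_test_files are made up for
illustration, and the one-line test.aff is an assumption, since the
test.aff used above is not shown:

```python
from pathlib import Path


def dic_text(entries):
    """Return .dic content: the entry count first, then one entry per line."""
    # Drop empty entries so the count on the first line matches the list.
    words = [w.strip() for w in entries if w.strip()]
    return "\n".join([str(len(words)), *words]) + "\n"


def write_test_files(directory, entries, samples):
    """Write test.dic, test.txt and a placeholder test.aff into directory."""
    base = Path(directory)
    (base / "test.dic").write_text(dic_text(entries), encoding="utf-8")
    (base / "test.txt").write_text("\n".join(samples) + "\n", encoding="utf-8")
    # Assumption: a bare affix file that only declares the encoding.
    # The test.aff from the post is not shown, so this is a stand-in.
    (base / "test.aff").write_text("SET UTF-8\n", encoding="utf-8")


if __name__ == "__main__":
    write_test_files(".", ["O.K."], ["O.K", "O.K."])
```

Swapping the entry list for ["O.K"] gives the second variant of the test.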

If I change the entry in the dic file to "O.K" (without the final dot),
both forms are recognized:

-------------------------

$ test.dic
1
O.K

$ analyze test.aff test.dic test.txt
> O.K
analyze(O.K) =  st:O.K
stem(O.K) = O.K
> O.K.
analyze(O.K.) =  st:O.K
analyze(O.K.) =  st:O.K
stem(O.K.) = O.K

-------------------------

The problem here is that it doesn't correct "O.K" to "O.K." which is
what I want.
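
One possible workaround, sketched in plain Python and kept entirely outside
Hunspell: keep a separate lookup from each abbreviation without its trailing
dot to the dotted form, and consult it for words the spell checker accepts.
The function names are hypothetical, and this does not describe how Hunspell
itself generates suggestions:

```python
def build_dot_index(abbreviations):
    """Map each abbreviation minus its trailing dots to the dotted form."""
    index = {}
    for abbr in abbreviations:
        stripped = abbr.rstrip(".")
        # Only entries that actually end in a dot need a correction rule.
        if stripped != abbr:
            index[stripped] = abbr
    return index


def suggest_dotted(word, index):
    """Return the dotted form if word lacks its final dot, else None."""
    return index.get(word)


if __name__ == "__main__":
    index = build_dot_index(["O.K.", "e.g."])
    for word in ["O.K", "O.K.", "e.g"]:
        print(word, "->", suggest_dotted(word, index))
```

A word that already carries its final dot gets no suggestion, so "O.K." is
left alone while "O.K" maps to "O.K.".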

Thanks

/S.Taher

Re: How does hunspell deal with dots in wordlists?

Posted by Rob Weir <ro...@apache.org>.
On Sat, Jul 21, 2012 at 11:21 AM, Sahand.T <sa...@gmail.com> wrote:
> Tj: Got it, I thought hunspell questions were ok to ask here since
> I've seen some before.
>

You might try our localization list as well:

http://incubator.apache.org/openofficeorg/mailing-lists.html#localization-mailing-list

Regards,

-Rob

> David McKay: It was just an example to show what I wanted to do (which
> is write any abbreviation with dots).
>
> Thanks
>
> S.T.

Re: How does hunspell deal with dots in wordlists?

Posted by "Sahand.T" <sa...@gmail.com>.
Tj: Got it, I thought hunspell questions were ok to ask here since
I've seen some before.

David McKay: It was just an example to show what I wanted to do (which
is write any abbreviation with dots).

Thanks

S.T.

Re: How does hunspell deal with dots in wordlists?

Posted by David McKay <dm...@btconnect.com>.
On 21/07/12 11:47, tj wrote:
> Although used by AOO, Hunspell is not an Apache product. Google is 
> your friend. --/tj/
>
I don't think that O.K. would be the correct form; usually it is written 
OK on its own. The full stops after the O and the K would imply that the 
O and the K are the first letters of words starting with O and K 
respectively, but OK is actually shorthand for 'okay', which is a single 
word. There is a theory that OK was originally an abbreviation of the 
purposely misspelled (for comic effect) Oll Korrect. That might or might 
not be true, but either way I believe the correct modern usage to be OK 
with no full stops.

Dave.



Re: How does hunspell deal with dots in wordlists?

Posted by tj <tj...@apache.org>.
Although used by AOO, Hunspell is not an Apache product. Google is your 
friend. --/tj/
