Posted to qa@openoffice.apache.org by "Marco A.G.Pinto" <ma...@mail.telepac.pt> on 2016/02/16 22:53:13 UTC

Thesaurus US - 752 synonyms with duplicate meanings

Hello!

Today I finished coding a bit more of my software, Proofing Tool GUI.

I have tested the new "unduplicate simple meanings" feature on the US
thesaurus and it found duplicated meanings in 752 synonyms.


Not sure if I should convert the thesaurus to UTF-8 and then remove the 
duplicates... what do you suggest?

I am exhausted and tomorrow I will try to resume the coding and attempt 
to release an official build of PTG for Windows and Linux on Friday.

The good news is that I have fixed the pause issue in Linux by
converting the remainder of the code (the thesaurus part) to dynamic
arrays instead of loading all synonyms into the ListIconGadget. I had
been postponing this for a couple of years or so, but I have finally
done it (it took a long time because I had to make changes all over
the code).

Just to share the news!

Thanks for your time!

Kind regards,
       >Marco A.G.Pinto
         -----------------------

-- 

Re: Thesaurus US - 752 synonyms with duplicate meanings

Posted by Andrea Pescetti <pe...@apache.org>.
Marco A.G.Pinto wrote:
> On 16/02/2016 22:56, Andrea Pescetti wrote:
>> May I know what is the definition of a "duplicate" meaning for your
>> tool? ...
> it means for example:
> apple|3:
> one
> two
> one
>
> It means that it would remove the repeated "one", becoming:
> apple|2:
> one
> two

I see. It's funny that we have this kind of duplicate. The Italian
thesaurus is already free from this. We may have entries like

apple|2
(everyday usage)|one|two|three
(medical jargon)|two|three|four

where "two" and "three" are repeated but carry two different meanings 
(so the first line might refer to a word in everyday usage and the 
second to a word in medical jargon, where only part of the previous 
synonyms still apply).
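
The distinction above can be sketched in a few lines of Python. This is
only an illustration, not how PTG or any existing tool works: the
function names are hypothetical, and it assumes each meaning line is a
label followed by "|"-separated synonyms, as in the example. Synonyms
are de-duplicated inside a single meaning line only, so a word shared by
two different meanings is left alone.

```python
def dedupe_within_meaning(line: str) -> str:
    """Drop repeated synonyms inside one meaning line, keeping first-seen order."""
    label, *synonyms = line.split("|")
    unique = list(dict.fromkeys(synonyms))  # dict keys preserve insertion order
    return "|".join([label, *unique])


def dedupe_entry_lines(meaning_lines: list[str]) -> list[str]:
    """Clean each meaning line on its own; lines are never compared to each other."""
    return [dedupe_within_meaning(line) for line in meaning_lines]


entry = [
    "(everyday usage)|one|two|three",
    "(medical jargon)|two|three|four",
]
# "two" and "three" appear in both lines but belong to different
# meanings, so this entry comes back unchanged.
cleaned = dedupe_entry_lines(entry)
```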

Regards,
   Andrea.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@openoffice.apache.org
For additional commands, e-mail: dev-help@openoffice.apache.org


Re: Thesaurus US - 752 synonyms with duplicate meanings

Posted by "Marco A.G.Pinto" <ma...@mail.telepac.pt>.
Hello!

On 16/02/2016 22:56, Andrea Pescetti wrote:
>> I have tested the new "unduplicate simple meanings" on the US thesaurus
>> and it found duplicated meanings in 752 synonyms:
>
> (moving the conversation to BCC for l10n and QA; interested people can 
> follow-up on dev)
>
> May I know what is the definition of a "duplicate" meaning for your 
> tool? It looks interesting. I may want to test it on the Italian 
> dictionary too.
>

Andrea, it means for example:
apple|3:
one
two
one

It means that it would remove the repeated "one", becoming:
apple|2:
one
two
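
The rule described above could be sketched in Python roughly as follows.
This is not PTG's actual code; the function name is hypothetical and
simply borrowed from the feature name, and the input is assumed to be
one entry in the simplified layout of the example: a "word|count" header
(with or without the trailing colon) followed by one meaning per line.

```python
def unduplicate_simple_meanings(entry: list[str]) -> list[str]:
    """Remove repeated meaning lines from one entry and fix the count."""
    header, meanings = entry[0], entry[1:]
    # Keep the trailing colon only if the header had one, as in the example.
    suffix = ":" if header.endswith(":") else ""
    word = header.rstrip(":").rsplit("|", 1)[0]
    # dict.fromkeys keeps the first occurrence of each meaning, in order.
    unique = list(dict.fromkeys(meanings))
    return [f"{word}|{len(unique)}{suffix}", *unique]


result = unduplicate_simple_meanings(["apple|3:", "one", "two", "one"])
```

Only exact, whole-line repeats are treated as duplicates here; lines
that differ in any character are kept.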



>> Not sure if I should convert the thesaurus to UTF-8 and then remove the
>> duplicates... what do you suggest?
>
> I converted the Italian dictionary to UTF-8 long ago without any 
> reported issues. I fail to see how this is related to a de-duplication 
> of some kind.
>

Converting the dictionary to UTF-8 wouldn't remove the duplicates. I 
just mentioned it because my tool uses UTF-8 and warns that opening 
non-UTF-8 files may lead to damaged characters :-[

To make sure (100%) that no data is lost, I would first need to convert 
it to UTF-8.
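
A conversion step of this kind might look like the Python sketch below.
It is an illustration under stated assumptions, not part of PTG: the
function name is hypothetical, and the source encoding defaults to
ISO-8859-1 purely as an example, since the thread does not say which
encoding the US thesaurus uses. If the file declares its encoding
somewhere, that declaration would also need updating, which this sketch
does not do.

```python
def to_utf8(raw: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Re-encode bytes from source_encoding to UTF-8, refusing lossy results."""
    # Strict decoding (the default) raises UnicodeDecodeError instead of
    # silently replacing bytes that are invalid in the source encoding.
    text = raw.decode(source_encoding)
    converted = text.encode("utf-8")
    # Round-trip check: converting back must reproduce the original bytes.
    if converted.decode("utf-8").encode(source_encoding) != raw:
        raise ValueError("round trip did not reproduce the original bytes")
    return converted


# b"\xe9" is "é" in ISO-8859-1; in UTF-8 it becomes the two bytes C3 A9.
sample = to_utf8(b"caf\xe9|1")
```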

:-P

Andrea, on Friday I am planning an official release of PTG (Windows and
Linux), but you can download a Windows-only version from my Dropbox:
https://dl.dropboxusercontent.com/u/30674540/ProofingToolGUI_V0092.zip

It has all the files in the ZIP including the source, images, etc. and 
the two executables for Windows (x64 and x86).

Thanks!

Kind regards,
      >Marco A.G.Pinto
        ------------------------

-- 

Re: Thesaurus US - 752 synonyms with duplicate meanings

Posted by Andrea Pescetti <pe...@apache.org>.
> I have tested the new "unduplicate simple meanings" on the US thesaurus
> and it found duplicated meanings in 752 synonyms:

(moving the conversation to BCC for l10n and QA; interested people can 
follow-up on dev)

May I know what is the definition of a "duplicate" meaning for your 
tool? It looks interesting. I may want to test it on the Italian 
dictionary too.

> Not sure if I should convert the thesaurus to UTF-8 and then remove the
> duplicates... what do you suggest?

I converted the Italian dictionary to UTF-8 long ago without any 
reported issues. I fail to see how this is related to a de-duplication 
of some kind.

Regards,
   Andrea.


