Posted to dev@openoffice.apache.org by Jürgen Schmidt <jo...@googlemail.com> on 2012/01/16 15:45:07 UTC

[RELEASE]: help wanted to define default dictionaries for languages

Hi,

[based on the assumption that we get the approval to bundle the 
dictionaries/thesaurus with our binary releases]

I am not really familiar with the dictionaries/thesaurus and would like 
to ask for help to define a list of default dictionaries/thesaurus for 
each language if possible.

Based on this list we have to work on a mechanism to include the 
dictionaries in our binary releases.

I am thinking of a simple list where we define the default dictionary for 
each language. Something like:

[DICTIONARIES]
de -> (German (de-DE frami) dictionaries), 
http://extensions.services.openoffice.org/en/download/5092, <name of 
download file>
de-DE -> (German (de-DE frami) dictionaries), 
http://extensions.services.openoffice.org/en/download/5092, <name of 
download file>
de-AT -> (German (de-AT frami) dictionaries), link not available- 
network error, <name of download file>
de-CH ->  (German (de-CH frami) dictionaries), 
http://extensions.services.openoffice.org/en/download/5091, <name of 
download file>
en -> (English dictionaries with fixed dash handling and new ligature 
and phonetic suggestion support), 
http://extensions.services.openoffice.org/en/download/3814, <name of 
download file>
en-US -> (US English Spell Checking Dictionary), 
http://extensions.services.openoffice.org/en/download/1471, <name of the 
download file>
...

[THESAURUS]
...


[this example list is based on wild guessing and browsing the available 
dictionaries in the ext repo, 
http://extensions.services.openoffice.org/en/dictionaries ]

I propose that we collect this list in the wiki first and later prepare 
a simple text file that we can check in to svn for the build process. 
As the place for the wiki page I suggest a sub-page under 
https://cwiki.apache.org/confluence/display/OOOUSERS/Project+Planning 
with the name "Bundled Default Language Tools" or so.
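
To make the idea of a checked-in text file a bit more concrete, here is a 
rough Python sketch of how a build script might read such a list. It is 
only an illustration under assumptions: it expects one entry per line in 
the "locale -> (description), URL, file name" layout shown above, and the 
file names in SAMPLE are made-up placeholders, not real download names.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    locale: str
    description: str
    url: str
    filename: str


def parse_defaults(text: str) -> dict[str, list[Entry]]:
    """Parse '[SECTION]' headers followed by 'locale -> (desc), url, file' lines."""
    sections: dict[str, list[Entry]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments and the "..." placeholders.
        if not line or line.startswith("#") or line == "...":
            continue
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(line[1:-1], [])
            continue
        if current is None or "->" not in line:
            raise ValueError(f"unexpected line: {raw!r}")
        locale, rest = line.split("->", 1)
        # Split from the right so commas inside the description survive.
        description, url, filename = (p.strip() for p in rest.rsplit(",", 2))
        # Drop the single pair of parentheses that wraps the description.
        if description.startswith("(") and description.endswith(")"):
            description = description[1:-1]
        current.append(Entry(locale.strip(), description, url, filename))
    return sections


# Hypothetical sample; the .oxt names are placeholders for illustration.
SAMPLE = """\
[DICTIONARIES]
de-DE -> (German (de-DE frami) dictionaries), http://extensions.services.openoffice.org/en/download/5092, placeholder-de-DE.oxt
de-CH -> (German (de-CH frami) dictionaries), http://extensions.services.openoffice.org/en/download/5091, placeholder-de-CH.oxt

[THESAURUS]
...
"""

defaults = parse_defaults(SAMPLE)
```

Splitting from the right keeps the parser simple, but it relies on the URL 
and file name never containing a comma; a real format might prefer a 
stricter separator.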

We also need a volunteer to analyze the mechanism for including these 
extensions in our binary releases ...

Does this make sense or does anybody have a better idea how to move 
forward with this?

Juergen




Re: [RELEASE]: help wanted to define default dictionaries for languages

Posted by Jürgen Schmidt <jo...@googlemail.com>.
On 1/17/12 7:38 PM, RGB ES wrote:
> 2012/1/17 Ariel Constenla-Haile<ar...@apache.org>
>
>>
>> sure you're facing this problem:
>> http://openoffice.org/projects/www/lists/dev/archive/2011-02/message/20
>>
>> For testing, I took all the dicts and made an extension for the 20
>> countries, every country has its own spelling dict, and all share the
>> hyphen and thes dicts:
>>
>> http://people.apache.org/~arielch/packages/Spell-Hyphen-Thes-es_ALL.oxt
>>
>>
>>
> Cool! Thanks!
>

I would like to ping again to ask whether anybody is interested in 
driving this forward. I think we need a well-defined list that we can 
rely on and that will be the basis for further packaging work. 
Is there anybody with enough knowledge to work on this?

Juergen


Re: [RELEASE]: help wanted to define default dictionaries for languages

Posted by RGB ES <rg...@gmail.com>.
2012/1/17 Ariel Constenla-Haile <ar...@apache.org>

>
> sure you're facing this problem:
> http://openoffice.org/projects/www/lists/dev/archive/2011-02/message/20
>
> For testing, I took all the dicts and made an extension for the 20
> countries, every country has its own spelling dict, and all share the
> hyphen and thes dicts:
>
> http://people.apache.org/~arielch/packages/Spell-Hyphen-Thes-es_ALL.oxt
>
>
>
Cool! Thanks!

Re: [RELEASE]: help wanted to define default dictionaries for languages

Posted by RGB ES <rg...@gmail.com>.
2012/1/17 Ariel Constenla-Haile <ar...@apache.org>

>
> For testing, I took all the dicts and made an extension for the 20
> countries, every country has its own spelling dict, and all share the
> hyphen and thes dicts:
>
> http://people.apache.org/~arielch/packages/Spell-Hyphen-Thes-es_ALL.oxt
>
It seems that the file dictionaries.xcu is missing from the extension... :)
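
A quick way to check for this kind of problem, sketched in Python: an .oxt 
is a ZIP archive, so the member list can be inspected directly. The 
function name is made up for this example, and it only checks that a 
dictionaries.xcu entry exists somewhere in the archive, not that its 
content is valid.

```python
import zipfile


def has_dictionaries_xcu(oxt_path) -> bool:
    """Return True if the archive has a dictionaries.xcu entry at any depth."""
    with zipfile.ZipFile(oxt_path) as oxt:
        return any(
            name.rsplit("/", 1)[-1] == "dictionaries.xcu"
            for name in oxt.namelist()
        )
```

Calling has_dictionaries_xcu("Spell-Hyphen-Thes-es_ALL.oxt") on a local 
copy of the package would be enough for a first sanity check.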

Re: [RELEASE]: help wanted to define default dictionaries for languages

Posted by Ariel Constenla-Haile <ar...@apache.org>.
Hi Ricardo,

On Mon, Jan 16, 2012 at 11:43:56PM +0100, RGB ES wrote:
> The Spanish thesaurus and hyphenation patterns can be found here:
> http://openthes-es.berlios.de/
> The spellchecker dictionary can be found here:
> http://forja.rediris.es/frs/?group_id=341&release_id=1000
> but I need to do some tests: I have problems installing it on recent dev
> builds...

sure you're facing this problem:
http://openoffice.org/projects/www/lists/dev/archive/2011-02/message/20

For testing, I took all the dicts and made an extension for the 20
countries: every country has its own spelling dict, and all share the
hyphen and thes dicts:

http://people.apache.org/~arielch/packages/Spell-Hyphen-Thes-es_ALL.oxt


Regards
-- 
Ariel Constenla-Haile
La Plata, Argentina

RE: [RELEASE]: help wanted to define default dictionaries for languages

Posted by "Dennis E. Hamilton" <de...@acm.org>.
That's why I said *minimum*.

-----Original Message-----
From: Jürgen Schmidt [mailto:jogischmidt@googlemail.com] 
Sent: Tuesday, January 17, 2012 08:16
To: ooo-dev@incubator.apache.org
Subject: Re: [RELEASE]: help wanted to define default dictionaries for languages

On 1/17/12 5:12 PM, Dennis E. Hamilton wrote:
> I'm going by what is being covered with developer snapshots at the moment:
> <https://cwiki.apache.org/confluence/display/OOOUSERS/AOO+3.4+Unofficial+Developer+Snapshots>

That is of course a wrong assumption: the list of languages there is 
only a first guess at the languages for which we have more or less 
identified volunteers who want to take a look at them.

It doesn't make sense to build all possible languages right now, 
especially when we don't have the Pootle data integrated and don't know 
how complete it is.

Juergen


>
>   - Dennis
>
> -----Original Message-----
> From: Andre Fischer [mailto:af@a-w-f.de]
> Sent: Tuesday, January 17, 2012 04:18
> To: ooo-dev@incubator.apache.org
> Subject: Re: [RELEASE]: help wanted to define default dictionaries for languages
>
> Hi Dennis,
>
> On 17.01.2012 02:14, Dennis E. Hamilton wrote:
>> If you look at the localization languages that are on the current list for built-by-AOO packages as part of a release, that effectively identifies the *minimum* set of languages that writing tools are required for.
>
> Can you provide a link for this?
>
> Thanks,
> Andre
>
>> [...]
>


Re: [RELEASE]: help wanted to define default dictionaries for languages

Posted by Jürgen Schmidt <jo...@googlemail.com>.
On 1/17/12 5:12 PM, Dennis E. Hamilton wrote:
> I'm going by what is being covered with developer snapshots at the moment:
> <https://cwiki.apache.org/confluence/display/OOOUSERS/AOO+3.4+Unofficial+Developer+Snapshots>

That is of course a wrong assumption: the list of languages there is 
only a first guess at the languages for which we have more or less 
identified volunteers who want to take a look at them.

It doesn't make sense to build all possible languages right now, 
especially when we don't have the Pootle data integrated and don't know 
how complete it is.

Juergen


>
>   - Dennis
>
> -----Original Message-----
> From: Andre Fischer [mailto:af@a-w-f.de]
> Sent: Tuesday, January 17, 2012 04:18
> To: ooo-dev@incubator.apache.org
> Subject: Re: [RELEASE]: help wanted to define default dictionaries for languages
>
> Hi Dennis,
>
> On 17.01.2012 02:14, Dennis E. Hamilton wrote:
>> If you look at the localization languages that are on the current list for built-by-AOO packages as part of a release, that effectively identifies the *minimum* set of languages that writing tools are required for.
>
> Can you provide a link for this?
>
> Thanks,
> Andre
>
>> [...]
>


RE: [RELEASE]: help wanted to define default dictionaries for languages

Posted by "Dennis E. Hamilton" <de...@acm.org>.
I'm going by what is being covered with developer snapshots at the moment:
<https://cwiki.apache.org/confluence/display/OOOUSERS/AOO+3.4+Unofficial+Developer+Snapshots>

 - Dennis

-----Original Message-----
From: Andre Fischer [mailto:af@a-w-f.de] 
Sent: Tuesday, January 17, 2012 04:18
To: ooo-dev@incubator.apache.org
Subject: Re: [RELEASE]: help wanted to define default dictionaries for languages

Hi Dennis,

On 17.01.2012 02:14, Dennis E. Hamilton wrote:
> If you look at the localization languages that are on the current list for built-by-AOO packages as part of a release, that effectively identifies the *minimum* set of languages that writing tools are required for.

Can you provide a link for this?

Thanks,
Andre

>[...]


Re: [RELEASE]: help wanted to define default dictionaries for languages

Posted by Andre Fischer <af...@a-w-f.de>.
Hi Dennis,

On 17.01.2012 02:14, Dennis E. Hamilton wrote:
> If you look at the localization languages that are on the current list for built-by-AOO packages as part of a release, that effectively identifies the *minimum* set of languages that writing tools are required for.

Can you provide a link for this?

Thanks,
Andre

>[...]

RE: [RELEASE]: help wanted to define default dictionaries for languages

Posted by "Dennis E. Hamilton" <de...@acm.org>.
If you look at the localization languages that are on the current list for built-by-AOO packages as part of a release, that effectively identifies the *minimum* set of languages that writing tools are required for.

In addition, some of the built packages cover other languages by default simply because it is common to want to *author* in some other languages even though the UI localization is for that of the built package.  For example, the English-language 

So there is a known set already, at least for an essential minimum.  There are, however, writing aids for more languages and dialects than there are UI localizations.

Of course, any localized package can have its writing aids augmented beyond any bundled ones by user download of additional extensions and templates, including the writing aids for other languages (and localizations too).  

 - Dennis

TECHNICAL NOTE: Currently, instead of packaging the relevant .OXT files in the binary and silently integrating them, the built-in ones are shipped in a pre-integrated form.  There would be some worthwhile simplification if the copy stored in the install location were kept as the .OXT, with the content cached the same way as for a user-installed .OXT.  (The bundled .OXT could still be locked down the way the pre-integrated form is now.  User-installed .OXT files are unloaded into a cache without retention of the .OXT in the application.)

This would also improve the clean separation of any non-Apache-licensed material among the .OXT artifacts.
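
As a rough illustration of that idea only (not a description of how the 
application actually handles extensions), the Python sketch below keeps the 
bundled .OXT untouched and unpacks its content into a separate cache 
directory on first use. The function name, the cache layout, and the use of 
the file stem as the cache key are all assumptions made for this example.

```python
import zipfile
from pathlib import Path


def cache_bundled_oxt(oxt_path: Path, cache_root: Path) -> Path:
    """Unpack a bundled .oxt into cache_root once; the .oxt itself is retained."""
    target = cache_root / oxt_path.stem  # hypothetical cache key
    if not target.exists():
        with zipfile.ZipFile(oxt_path) as oxt:
            oxt.extractall(target)
    return target
```

Because the original archive stays in the install location, material under 
another license remains a single, clearly separable file.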

-----Original Message-----
From: RGB ES [mailto:rgb.mldc@gmail.com] 
Sent: Monday, January 16, 2012 14:44
To: ooo-dev@incubator.apache.org
Subject: Re: [RELEASE]: help wanted to define default dictionaries for languages

2012/1/16 Jürgen Schmidt <jo...@googlemail.com>

> Hi,
>
> [based on the assumption that we get the approval to bundle the
> dictionaries/thesaurus with our binary releases]
>
> I am not really familiar with the dictionaries/thesaurus and would like
> to ask for help to define a list of default dictionaries/thesaurus for each
> language if possible.
>
> Based on this list we have to work on a mechanism to include the
> dictionaries in our binary releases.
>
> I think about a simple list where we define the default dictionary for
> each language. Something like
>
> [DICTIONARIES]
> de -> (German (de-DE frami) dictionaries),
> http://extensions.services.openoffice.org/en/download/5092,
> <name of download file>
> de-DE -> (German (de-DE frami) dictionaries),
> http://extensions.services.openoffice.org/en/download/5092,
> <name of download file>
> de-AT -> (German (de-AT frami) dictionaries), link not available- network
> error, <name of download file>
> de-CH ->  (German (de-CH frami) dictionaries),
> http://extensions.services.openoffice.org/en/download/5091,
> <name of download file>
> en -> (English dictionaries with fixed dash handling and new ligature and
> phonetic suggestion support),
> http://extensions.services.openoffice.org/en/download/3814,
> <name of download file>
> en-US -> (US English Spell Checking Dictionary),
> http://extensions.services.openoffice.org/en/download/1471,
> <name of the download file>
> ...
>
> [THESAURUS]
> ...
>
>
> [this example list is based on wild guessing and browsing the available
> dictionaries in the ext repo,
> http://extensions.services.openoffice.org/en/dictionaries ]
>
> I propose that we collect this list in the wiki first and prepare later on
> a simple text file that we can check-in in svn for the build process. As
> place for the wiki page I suggest a sub-page under
> https://cwiki.apache.org/confluence/display/OOOUSERS/Project+Planning
> with the name "Bundled Default Language Tools" or so.
>
> We need also a volunteer who analyze the mechanism to include these
> extensions in our binary releases ...
>
> Does this make sense or does anybody have a better idea how to move
> forward with this?
>
> Juergen
>
>
>
The Spanish thesaurus and hyphenation patterns can be found here:
http://openthes-es.berlios.de/
The spellchecker dictionary can be found here:
http://forja.rediris.es/frs/?group_id=341&release_id=1000
but I need to do some tests: I have problems installing it on recent dev
builds...

Cheers
Ricardo


Re: [RELEASE]: help wanted to define default dictionaries for languages

Posted by RGB ES <rg...@gmail.com>.
2012/1/16 Jürgen Schmidt <jo...@googlemail.com>

> Hi,
>
> [based on the assumption that we get the approval to bundle the
> dictionaries/thesaurus with our binary releases]
>
> I am not really familiar with the dictionaries/thesaurus and would like
> to ask for help to define a list of default dictionaries/thesaurus for each
> language if possible.
>
> Based on this list we have to work on a mechanism to include the
> dictionaries in our binary releases.
>
> I think about a simple list where we define the default dictionary for
> each language. Something like
>
> [DICTIONARIES]
> de -> (German (de-DE frami) dictionaries),
> http://extensions.services.openoffice.org/en/download/5092,
> <name of download file>
> de-DE -> (German (de-DE frami) dictionaries),
> http://extensions.services.openoffice.org/en/download/5092,
> <name of download file>
> de-AT -> (German (de-AT frami) dictionaries), link not available- network
> error, <name of download file>
> de-CH ->  (German (de-CH frami) dictionaries),
> http://extensions.services.openoffice.org/en/download/5091,
> <name of download file>
> en -> (English dictionaries with fixed dash handling and new ligature and
> phonetic suggestion support),
> http://extensions.services.openoffice.org/en/download/3814,
> <name of download file>
> en-US -> (US English Spell Checking Dictionary),
> http://extensions.services.openoffice.org/en/download/1471,
> <name of the download file>
> ...
>
> [THESAURUS]
> ...
>
>
> [this example list is based on wild guessing and browsing the available
> dictionaries in the ext repo,
> http://extensions.services.openoffice.org/en/dictionaries ]
>
> I propose that we collect this list in the wiki first and prepare later on
> a simple text file that we can check-in in svn for the build process. As
> place for the wiki page I suggest a sub-page under
> https://cwiki.apache.org/confluence/display/OOOUSERS/Project+Planning
> with the name "Bundled Default Language Tools" or so.
>
> We need also a volunteer who analyze the mechanism to include these
> extensions in our binary releases ...
>
> Does this make sense or does anybody have a better idea how to move
> forward with this?
>
> Juergen
>
>
>
The Spanish thesaurus and hyphenation patterns can be found here:
http://openthes-es.berlios.de/
The spellchecker dictionary can be found here:
http://forja.rediris.es/frs/?group_id=341&release_id=1000
but I need to do some tests: I have problems installing it on recent dev
builds...

Cheers
Ricardo

Re: [RELEASE]: help wanted to define default dictionaries for languages

Posted by Jürgen Schmidt <jo...@googlemail.com>.
On 2/6/12 12:18 AM, Andrea Pescetti wrote:
> On 16/01/2012 Jürgen Schmidt wrote:
>> I am not really familiar with the dictionaries/thesaurus and would like
>> to ask for help to define a list of default dictionaries/thesaurus for
>> each language if possible.
>
> We should start from the dictionaries that were already included in
> OpenOffice.org 3.4 beta. Besides some special cases, like the Spanish
> dictionaries already discussed here, they will be the dictionaries users
> are used to.

We can't store them in our repo and have to download them either from 
the extension repo (which I would prefer, if possible) or from somewhere 
else. But we need the kind of list that I have mentioned. Based on such 
a list we can start to work on the build process and the integration.

And yes we need also somebody who is interested to work on this ;-)

>
>> Based on this list we have to work on a mechanism to include the
>> dictionaries in our binary releases. ...
>> I propose that we collect this list in the wiki first and prepare later
>> on a simple text file that we can check-in in svn for the build process.
>> As place for the wiki page I suggest a sub-page under
>> https://cwiki.apache.org/confluence/display/OOOUSERS/Project+Planning
>> with the name "Bundled Default Language Tools" or so.
>
> I started
> https://cwiki.apache.org/confluence/display/OOOUSERS/Bundled+Writing+Aids
> (short: http://s.apache.org/rf )
> but matching all dictionaries with their upstream sources will take some
> time, especially considering that the Extensions site is not stable yet.

thanks for starting the wiki page.

That is the reason why I have tried to start early. I hope that in the 
future the extension site will be more stable and can be used as a 
reliable source for downloading these extensions.

>
>> We need also a volunteer who analyze the mechanism to include these
>> extensions in our binary releases ...
>
> If we manage to download them, everything should be ready, with two
> major obstacles:
> 1) Depending on the availability of the Extensions site (or of a dozen
> different sites) to download the dictionaries at build time is problematic.
As mentioned before, that will hopefully be no problem in the future.

> 2) Dictionaries included in the OpenOffice.org sources were repackaged:
> for example, files in the Italian writing aids extension were moved and
> renamed; so it could be that we cannot just download extensions as they
> are, and if we need some logic/scripts we must store it somewhere.

That is something that we don't want in the future. I think we can only 
download and include them as they come from the extension repo (or 
somewhere else) and bundle them with our binary.

Well, that might be different for ALv2-licensed stuff, but ideally we 
would handle all dictionaries equally.

Juergen

>
> Regards,
> Andrea.


Re[2]: [RELEASE]: help wanted to define default dictionaries for languages

Posted by Yakov Reztsov <ya...@mail.ru>.


On 6 February 2012, 03:19, Andrea Pescetti wrote:
> On 16/01/2012 Jürgen Schmidt wrote:
> > I am not really familiar with the dictionaries/thesaurus and would like
> > to ask for help to define a list of default dictionaries/thesaurus for
> > each language if possible.
> 
> We should start from the dictionaries that were already included in
> OpenOffice.org 3.4 beta. Besides some special cases, like the Spanish
> dictionaries already discussed here, they will be the dictionaries users
> are used to.
> 
> > Based on this list we have to work on a mechanism to include the
> > dictionaries in our binary releases. ...
> > I propose that we collect this list in the wiki first and prepare later
> > on a simple text file that we can check-in in svn for the build process.
> > As place for the wiki page I suggest a sub-page under
> > https://cwiki.apache.org/confluence/display/OOOUSERS/Project+Planning
> > with the name "Bundled Default Language Tools" or so.
> 
> I started
> https://cwiki.apache.org/confluence/display/OOOUSERS/Bundled+Writing+Aids
> (short: http://s.apache.org/rf )
> but matching all dictionaries with their upstream sources will take some
> time, especially considering that the Extensions site is not stable yet.

The Russian spellcheck dictionary is included in OpenOffice.org 3.4 beta.
Great!


Re: [RELEASE]: help wanted to define default dictionaries for languages

Posted by Andrea Pescetti <pe...@apache.org>.
On 16/01/2012 Jürgen Schmidt wrote:
> I am not really familiar with the dictionaries/thesaurus and would like
> to ask for help to define a list of default dictionaries/thesaurus for
> each language if possible.

We should start from the dictionaries that were already included in 
OpenOffice.org 3.4 beta. Besides some special cases, like the Spanish 
dictionaries already discussed here, they will be the dictionaries users 
are used to.

> Based on this list we have to work on a mechanism to include the
> dictionaries in our binary releases. ...
> I propose that we collect this list in the wiki first and prepare later
> on a simple text file that we can check-in in svn for the build process.
> As place for the wiki page I suggest a sub-page under
> https://cwiki.apache.org/confluence/display/OOOUSERS/Project+Planning
> with the name "Bundled Default Language Tools" or so.

I started
https://cwiki.apache.org/confluence/display/OOOUSERS/Bundled+Writing+Aids
(short: http://s.apache.org/rf )
but matching all dictionaries with their upstream sources will take some 
time, especially considering that the Extensions site is not stable yet.

> We need also a volunteer who analyze the mechanism to include these
> extensions in our binary releases ...

If we manage to download them, everything should be ready, with two 
major obstacles:
1) Depending on the availability of the Extensions site (or of a dozen 
different sites) to download the dictionaries at build time is problematic.
2) Dictionaries included in the OpenOffice.org sources were repackaged: 
for example, files in the Italian writing aids extension were moved and 
renamed; so it could be that we cannot just download extensions as they 
are, and if we need some logic/scripts we must store it somewhere.
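
One possible way to soften obstacle 1, sketched in Python: keep a local 
cache of already downloaded extensions and only contact the upstream site 
when a file is missing, verifying a pinned checksum either way. This is 
just a sketch under assumptions; the function name, the cache layout, and 
the idea of recording a SHA-256 value next to each entry in the list are 
not part of any existing build script.

```python
import hashlib
import urllib.request
from pathlib import Path


def fetch_dictionary(url: str, cache_dir: Path, filename: str, sha256: str) -> Path:
    """Return the cached extension, downloading it only when it is not cached yet."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / filename
    if not target.exists():
        # Only reached on a cache miss, so an unavailable site does not
        # break builds that already have the file locally.
        with urllib.request.urlopen(url, timeout=30) as response:
            target.write_bytes(response.read())
    digest = hashlib.sha256(target.read_bytes()).hexdigest()
    if digest != sha256:
        # Drop the bad copy so that a later run downloads it again.
        target.unlink()
        raise ValueError(f"checksum mismatch for {filename}: got {digest}")
    return target
```

The checksum also guards against an upstream package changing silently, 
which matters if extensions are bundled exactly as downloaded.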

Regards,
   Andrea.