Posted to l10n@openoffice.apache.org by Maria Hartmann <ma...@hotmail.de> on 2014/08/05 17:23:32 UTC

Spellchecker for Rhaeto-Romanic



Hello dear OpenOffice community,
I am new here and English isn't my first language, but I will try to ask my question about developing the affix file for a Rhaeto-Romanic spellchecker. Rhaeto-Romanic has five varieties. Writing a spellchecker for Sursilvan, one of the varieties, is the topic of my bachelor thesis.
I have the wordlist and I am writing the affix file at the moment. It already works for words where only a few characters are replaced rather than the whole word. How can I replace whole words? For example: I have the verb "be" in the wordlist and need the forms "am" and "are". Do I have to add them to the wordlist manually? Or is there a kind of rule to replace "be" with the other forms?
I would be glad to hear from you,
Mary

RE: Spellchecker for Rhaeto-Romanic

Posted by Maria Hartmann <ma...@hotmail.de>.
Hey again...

My next problem is that if I import a Rhaeto-Romanic text into OpenOffice Writer with Rhaeto-Romanic as the document language, I cannot run the spell check. I always get the message that the spell check is complete. What's wrong?
I see my spell checker in the Extension Manager, but I cannot select it as the spell checker during the spell check.
Here is the link to the current version:
https://www.dropbox.com/s/ognv53u56a085v2/rm_CH.oxt?dl=0

Is a detailed description of how Hunspell works available anywhere? I have to describe it in my thesis, so it has to be official and detailed...

Regards,
Maria

> Date: Sun, 31 Aug 2014 22:13:32 +0200
> From: pescetti@apache.org
> To: l10n@openoffice.apache.org
> Subject: Re: Spellchecker for Rhaeto-Romanic

Re: Spellchecker for Rhaeto-Romanic

Posted by Andrea Pescetti <pe...@apache.org>.
On 31/08/2014 Maria Hartmann wrote:
> What would happen, if somebody writes a spell checker for another
> dialect and takes the ISO code rm_CH? It has to be incompatible with
> my one and not an update.

Nothing bad will happen. Simply, people won't be able to have the two 
extensions enabled simultaneously (actually, it will work even if they 
install both, but the behavior can be a bit unexpected).

> The dic and aff file have to have the same name but the name doesn't
> have to contain something with the ISO code, right?

Correct. The file name has no importance. You could name them (stupid 
example) mydict.dic and mydict.aff and be fine.

Regards,
   Andrea.

---------------------------------------------------------------------
To unsubscribe, e-mail: l10n-unsubscribe@openoffice.apache.org
For additional commands, e-mail: l10n-help@openoffice.apache.org


RE: Spellchecker for Rhaeto-Romanic

Posted by Maria Hartmann <ma...@hotmail.de>.
Yeah, right, standardised codes are really useful for agreeing with other software.

I cannot translate the interface, because I don't speak the language. ;) That's work for somebody else. But if I understand your mail correctly, there is no dictionary associated with the document language, just the code for standardisation. That's okay. I saw that there is no spell checker for the standardised written language Rumantsch Grischun in OpenOffice (one is available for Microsoft), so there will be no clash with another dialect. I like the idea of implementing different spellings of Rhaeto-Romanic, but that's work for the next update or so...
So I will use rm_CH for my dialect spell checker and choose Rhaeto-Romanic as the document language.

What would happen if somebody wrote a spell checker for another dialect and used the ISO code rm_CH? It has to be incompatible with mine and not an update of it.

The dic and aff files have to have the same name, but the name doesn't have to contain the ISO code, right?

Thank you!
Maria

> Date: Sun, 31 Aug 2014 17:49:04 +0200
> From: pescetti@apache.org
> To: l10n@openoffice.apache.org
> Subject: Re: Spellchecker for Rhaeto-Romanic

Re: Spellchecker for Rhaeto-Romanic

Posted by Andrea Pescetti <pe...@apache.org>.
On 31/08/2014 Maria Hartmann wrote:
> Oh no, I suspected something like that... Sursilvan has no ISO Code,
> the only code I came across was in the Registry Of Dialects (ROD)
> Code:16069.

We only implement ISO codes as document languages, or so I believe. So
you will have to use the generic "rm" for your dictionary.

> Sorry, I don't really understand how I can get the unofficial code for
> the Rhaeto-Romanic dialect. Is there a system to create a code? Or
> is it given by Microsoft and has to be written in this list?
> http://opengrok.adfinis-sygroup.org/source/xref/aoo-trunk/main/i18npool/inc/i18npool/lang.h

There are two different concepts of a language in OpenOffice: the
document language and the interface language. If you wished to translate
OpenOffice into Sursilvan, we could just invent a code and put it there.
And this would only be used to denote the "Sursilvan" version of
OpenOffice. But if you want to use that language for documents, ISO
codes are needed, since that document can be sent to others and opened
with different software, so we need to use standards for that (while for
an OpenOffice language pack we needn't agree with anyone else, we simply
invent a code).

> I could apply it to rm_CH but how would I treat the four other
> dialects of this language? Is it possible to apply more than one
> dialect to one language code? The problem is, that the dialects are
> spoken in the same country. It's a different situation from that of other
> languages, which are spoken in different countries. "The language"
> Rhaeto-Romanic is actually not a language because it consists of five
> dialects and one standardised writing language. But my spell checker
> should be for one dialect and its own writing system.

We have the same issue with Valencian (that, for ISO, is a dialect of 
Catalan; we release OpenOffice in Valencian, actually in two variants of 
Valencian). We can't set the document language to Valencian since it 
doesn't have an ISO code. So people created dictionaries for Valencian 
that rely on the document language to be Catalan. Of course, they cannot 
be used together or with Catalan. See the last part of
http://svn.apache.org/viewvc/openoffice/trunk/main/extensions.lst?view=markup
(if you open the extensions, you will see that they all apply to 
Catalan, so they are incompatible with each other).

For private use, of course, you can do whatever you wish; a popular 
choice is to set the code to Esperanto, but that's really a kludge and we 
don't include those dictionaries in OpenOffice.

The French dictionary (same link above) comes with a mechanism to switch 
between different spellings of French. In a way, this could be an idea 
if one wants to create a dictionary for Rhaeto-Romanic that includes all 
the five variants.

Regards,
   Andrea.



RE: Spellchecker for Rhaeto-Romanic

Posted by Maria Hartmann <ma...@hotmail.de>.
Oh no, I suspected something like that...
Sursilvan has no ISO code; the only code I came across was in the Registry of Dialects (ROD), code 16069.
But I don't know how official that is. Does anybody here know something about this registry?
http://globalrecordings.net/research/dialect/16069

Sorry, I don't really understand how I can get the unofficial code for the Rhaeto-Romanic dialect. Is there a system for creating a code? Or is it assigned by Microsoft and has to be listed in this file?
http://opengrok.adfinis-sygroup.org/source/xref/aoo-trunk/main/i18npool/inc/i18npool/lang.h 

I could apply it to rm_CH, but how would I treat the four other dialects of this language? Is it possible to apply more than one dialect to one language code? The problem is that the dialects are spoken in the same country. It's a different situation from that of other languages, which are spoken in different countries. "The language" Rhaeto-Romanic is actually not a language, because it consists of five dialects and one standardised written language. But my spell checker should be for one dialect and its own writing system.

Best regards, 
maria



> Date: Fri, 29 Aug 2014 22:05:38 +0200
> From: pescetti@apache.org
> To: l10n@openoffice.apache.org
> Subject: Re: Spellchecker for Rhaeto-Romanic

Re: Spellchecker for Rhaeto-Romanic

Posted by Andrea Pescetti <pe...@apache.org>.
On 18/08/2014 Maria Hartmann wrote:
> Thanks a lot for your tips - it looks much better now, because the
> name of the extension and the publisher's name are shown in the
> extension manager.

Hello Maria, thanks for your patience. Does the extension work better now?

> No, I didn't know that the language
> has to be available as a document language. Where can I check it? I
> see "Rätoromanisch" as a possible language during the import of a
> document into OpenOffice Writer.

It's here:
http://opengrok.adfinis-sygroup.org/source/xref/aoo-trunk/main/svtools/source/misc/langtab.src
and indeed you will see that we have an entry for "Rhaeto-Romance".

> But the spell checker I wrote is for a
> variety of Rätoromanisch. I can't imagine that Sursilvan is supported.

We can support a variety of languages as "unofficial" languages (where 
I'm not sure there is a precise definition, or that it depends on any 
reasonable factor other than Microsoft having assigned an ID for it on 
Windows). You will see them appear as "user" languages here: 
http://opengrok.adfinis-sygroup.org/source/xref/aoo-trunk/main/i18npool/inc/i18npool/lang.h

If you have any more information about Sursilvan (does it have an ISO 
639 code? See http://en.wikipedia.org/wiki/List_of_ISO_639-3_codes ) we 
can look into it further.

> Is it a problem? Would my spell checker work even though it isn't a
> document language? What does it mean to be a document language?

A spell checker can only work if the language it attaches to is an 
allowed document language and if the corresponding text (select, then 
Format - Character) is set to be in that language.

So to have a working spell checker for Rhaeto-Romanic you will have to 
specify in description.xml that it applies to rm_CH, see
http://opengrok.adfinis-sygroup.org/source/xref/aoo-trunk/main/i18npool/source/isolang/isolang.cxx#195

Or we'll have to find out whether Sursilvan has some codes that we can use 
and try to add support for them in the next version of OpenOffice. But in 
the short term you'd better edit the extension and try to apply it to rm_CH.

A tip: while developing extensions, at times things can go wrong and you 
may be left with extensions that can't be uninstalled or updated. The 
simplest (even if it's not the most elegant) way to get around it is to 
reset your user profile. See 
https://forum.openoffice.org/en/forum/viewtopic.php?p=58403

If you make any further steps, please share your oxt package again and 
we can look at it.

Regards,
   Andrea.



RE: Spellchecker for Rhaeto-Romanic

Posted by Maria Hartmann <ma...@hotmail.de>.
Thanks a lot for your tips - it looks much better now, because the name of the extension and the publisher's name are shown in the Extension Manager.
It doesn't look like it is mandatory for the xcu to be in a "registry" folder. The Dutch, English and German extensions don't pack the xcu in a subfolder.
No, I didn't know that the language has to be available as a document language. Where can I check it? I see "Rätoromanisch" as a possible language during the import of a document into OpenOffice Writer. But the spell checker I wrote is for a variety of Rätoromanisch. I can't imagine that Sursilvan is supported. Is that a problem? Would my spell checker work even though it isn't a document language? What does it mean to be a document language?
Regards, Maria

> Date: Sun, 17 Aug 2014 22:10:38 +0200
> From: pescetti@apache.org
> To: l10n@openoffice.apache.org
> Subject: Re: Spellchecker for Rhaeto-Romanic

Re: Spellchecker for Rhaeto-Romanic

Posted by Andrea Pescetti <pe...@apache.org>.
On 16/08/2014 Maria Hartmann wrote:
> Okay. I am using Windows 7 and OpenOffice 4.

Ok, good to know.

> https://www.dropbox.com/s/6chp8mwr4451j4q/roh_SUR.oxt

I see the most classic mistake there: Our (insane) convention is that 
your files should NOT be inside a folder. So, your ZIP file contains the 
"Dictionary_roh_SUR" folder, which in turn contains the dictionary 
files. As a first step, you should fix this and the extension will 
install cleanly.
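
As a rough illustration of the layout described above (a sketch, not an official packaging script), the following Python snippet zips a folder so that its files sit at the root of the .oxt rather than inside a wrapper folder. The helper names build_oxt and top_level_entries are hypothetical, and the demo uses empty placeholder files named after the ones mentioned in this thread.

```python
import tempfile
import zipfile
from pathlib import Path


def build_oxt(source_dir, oxt_path):
    """Zip the contents of source_dir so they sit at the archive root."""
    source = Path(source_dir)
    with zipfile.ZipFile(oxt_path, "w", zipfile.ZIP_DEFLATED) as oxt:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # Store paths relative to source_dir, so the folder itself
                # does not appear inside the archive.
                oxt.write(path, path.relative_to(source).as_posix())
    return oxt_path


def top_level_entries(oxt_path):
    """Return the distinct first path components stored in the archive."""
    with zipfile.ZipFile(oxt_path) as oxt:
        return sorted({name.split("/")[0] for name in oxt.namelist()})


# Demo with empty placeholder files (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "Dictionary_roh_SUR"
    folder.mkdir()
    for name in ("roh_SUR.aff", "roh_SUR.dic", "description.xml",
                 "dictionaries.xcu", "readme.txt"):
        (folder / name).write_text("", encoding="utf-8")
    oxt_file = build_oxt(folder, Path(tmp) / "roh_SUR.oxt")
    entries = top_level_entries(oxt_file)

# The wrapper folder name should not show up among the top-level entries.
print(entries)
```

Listing the top-level entries of an existing .oxt in the same way is a quick check for the "files inside a folder" mistake before installing the extension.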

A second issue I see is that
    <value>%origin%/../dictionaries/roh_SUR.aff %origin%/../dictionaries/roh_SUR.dic</value>
assumes that dictionaries are in "../dictionaries", which is true for 
the Italian dictionary, but not for yours.

I believe the dictionaries.xcu file should be in a subfolder called 
"registry", but I can't test at the moment if this is mandatory. Anyway, 
the path above must be corrected accordingly (so, simply .. and not 
../dictionaries if you move dictionaries.xcu into the registry/ 
subfolder).
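
The path reasoning above can be sketched in a few lines of Python. This assumes that %origin% stands for the folder containing dictionaries.xcu, which is what the advice above implies; resolve_location and stays_inside_package are hypothetical helper names, and all paths are relative to the root of the extension package.

```python
import posixpath


def resolve_location(xcu_path, location):
    """Resolve a %origin% location relative to the folder holding the xcu."""
    origin = posixpath.dirname(xcu_path) or "."
    return posixpath.normpath(location.replace("%origin%", origin))


def stays_inside_package(resolved):
    """True if the resolved path does not climb out of the package root."""
    return not (resolved == ".." or resolved.startswith("../"))


# xcu at the package root, but the location still points to ../dictionaries
broken = resolve_location("dictionaries.xcu",
                          "%origin%/../dictionaries/roh_SUR.aff")

# xcu moved into registry/, location corrected to a plain ".."
fixed = resolve_location("registry/dictionaries.xcu",
                         "%origin%/../roh_SUR.aff")

# xcu left at the package root next to the dictionary files
flat = resolve_location("dictionaries.xcu", "%origin%/roh_SUR.aff")

for label, path in (("broken", broken), ("fixed", fixed), ("flat", flat)):
    print(label, path, stays_inside_package(path))
```

In this sketch only the first location leaves the package; the other two resolve to roh_SUR.aff at the package root.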

And then I hit a blocker: a possible issue with language codes. Have 
you already checked that OpenOffice supports roh_SUR as a document 
language? If you are unsure of where to check, just ask for directions.

Regards,
   Andrea.



RE: Spellchecker for Rhaeto-Romanic

Posted by Maria Hartmann <ma...@hotmail.de>.
Okay. I am using Windows 7 and OpenOffice 4. I think the following link will work:

https://www.dropbox.com/s/6chp8mwr4451j4q/roh_SUR.oxt

The readme isn't complete yet, but everything else should actually be okay...
Rules F, D and K are not used, and I tried to use X and Y but it wasn't working, so they aren't used in this version either.
Sorry for the German comments. ;)

Thank you,
Maria

> Date: Fri, 15 Aug 2014 21:06:04 +0200
> From: pescetti@apache.org
> To: l10n@openoffice.apache.org
> Subject: Re: Spellchecker for Rhaeto-Romanic

Re: Spellchecker for Rhaeto-Romanic

Posted by Andrea Pescetti <pe...@apache.org>.
On 15/08/2014 Maria Hartmann wrote:
> Exactly, now
> I have to remove redundancy. And the compounding isn’t working yet.

OK, we can look at this later. What operating system (Windows, Mac, 
Linux) are you using, so that we can be more helpful in suggesting how 
to configure tools?

> My extension consists of the dic, the aff, the description.xml,
> the dictionary.xcu and a readme.txt with the license, so what's wrong?

Let us see it; at times problems are very trivial to solve but hard 
to find. Can you put the resulting .oxt file somewhere on the Internet 
and send us the link? You won't be able to attach it (attachments are 
removed), but if you put it on dropbox.com or wikisend.com or any other 
file hosting service we can have a look and try it with the latest 
OpenOffice build.

Regards,
   Andrea.



RE: Spellchecker for Rhaeto-Romanic

Posted by Maria Hartmann <ma...@hotmail.de>.

The current situation is as follows:

I got a wordlist with lemmas and part-of-speech tags and created an affix file from a grammar. To connect the words and the rules I wrote a Java program which replaces some PoS tags with the rule name. Some of them were connected manually. Yes, I can imagine that the tool works well for maintenance, so I can use it now.

Exactly, now I have to remove redundancy. And the compounding isn't working yet.

For example:

I have the words tschun/Zxyz, in/Zxyz and melli/xyz in the dictionary and the rule Z in my affix file:

SFX Z Y 26
SFX Z    0        onta/WXZxyz       [gt]              # 10
SFX Z    0        tschien/Xxyz      .                 # 100
SFX Z    0        melli/Xxyz        .                 # 1000
SFX Z    0        s/V               [ai]              # dua, trei --> duas, treis
SFX Z    a        s/V               dua
SFX Z    in       endisch           in                # 11
SFX Z    a        disch/W           a                 # dudisch (12)
SFX Z    i        disch/W           i                 # tredisch (13)
SFX Z    ater     itordisch/W       quater            # quitordisch (14)
SFX Z    t        disch/W           quent             # quendisch (15)
SFX Z    is       edisch/W          sis               # sedisch (16)
SFX Z    dua      vegn/ZW           dua               # 20
SFX Z    0        conta/WXZxyz      un                # 50
SFX Z    0        sonta/WXZxyz      s                 # 60
SFX Z    ov       avonta/WXZxyz     nov               # 90
SFX Z    a        in/ZXxyz          a                 # trentin(31), curontin(41)
SFY Z    in       emprem/W          in                # 1.
SFY Z    in       prem              in                # 1.
SFX Z    dua      secund/W          dua               # 2.
SFX Z    rei      ierz/W            trei              # tierz (3.)
SFX Z    rei      erzio             trei              # terzio (3.)
SFX Z    ter      rt/W              quater            # quart (4.)
SFX Z    tschun   quint/W           tschun            # quint (5.)
SFX Z    0        avel/W            [uiaton][nstgv]   # 6.,7. etc.
SFX Z    rei      iarza             trei              # tiarza (1/3)
SFX Z    ent      int               quent

tschuncontamelli is correctly recognised and tschuncontainmelli (not correct) is suggested, but the correct word tschuncontinmelli isn't found. Why not? How does the compounding work?
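
This question is not answered in the thread. One possible explanation, offered here only as an assumption and not as a confirmed diagnosis: the Hunspell documentation describes twofold suffix stripping, that is, at most two suffixes in a row, while tschuncontinmelli would need three steps with the rules above (tschun, tschunconta, tschuncontin, tschuncontinmelli). The Python sketch below is a simplified model and not Hunspell itself; it uses only three of the Z rules from this post, ignores the other flags, and the derive helper is hypothetical.

```python
import re

# Subset of the Z rules from the post: (strip, add, continuation flags, condition).
RULES = {
    "Z": [
        ("", "conta", "WXZxyz", "un"),  # 50
        ("a", "in", "ZXxyz", "a"),      # trentin (31), curontin (41)
        ("", "melli", "Xxyz", "."),     # 1000
    ]
}


def derive(word, flags, max_depth):
    """Collect the forms reachable with at most max_depth chained suffixes."""
    forms = {word}
    if max_depth == 0:
        return forms
    for flag in flags:
        for strip, add, continuation, condition in RULES.get(flag, []):
            # The condition must match the end of the word, and the
            # characters to strip must actually be there.
            if not re.search(condition + "$", word):
                continue
            if not word.endswith(strip):
                continue
            derived = word[: len(word) - len(strip)] + add
            forms |= derive(derived, continuation, max_depth - 1)
    return forms


two_levels = derive("tschun", "Zxyz", 2)
three_levels = derive("tschun", "Zxyz", 3)

print("tschuncontamelli" in two_levels)    # reachable with two suffixes
print("tschuncontinmelli" in two_levels)   # not reachable with two suffixes
print("tschuncontinmelli" in three_levels)
```

If this assumption holds, one way around it might be to list intermediate forms such as tschunconta directly in the .dic file with their own flags, so that fewer chained suffixes are needed.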

 

I am testing at the command prompt, because I can't see the extension in OO Writer. I imported it via the Extension Manager, but it isn't available for the spell check yet.

My extension consists of the dic, the aff, the description.xml, the dictionary.xcu and a readme.txt with the license, so what's wrong?

Regards,
Maria

> Date: Fri, 15 Aug 2014 17:44:04 +0200
> From: pescetti@apache.org
> To: l10n@openoffice.apache.org
> Subject: Re: Spellchecker for Rhaeto-Romanic

Re: Spellchecker for Rhaeto-Romanic

Posted by Andrea Pescetti <pe...@apache.org>.
On 12/08/2014 Maria Hartmann wrote:
> Good morning,
> the proofing tool gui is no help, sadly.

I had understood that what you wanted to do is maintenance and checks on 
a dictionary, so that you had already created your .dic file, your .aff 
file and added (manually) the rules to the .dic file. This is how 
maintenance is normally done, and it is the scenario in which the tool 
is useful.

Are you at another stage instead? I mean, what does your .aff file look 
like? Are you at the stage where you have a huge wordlist collected 
randomly (so, not by hand) and you created some .aff rules and want to 
remove redundancy (i.e., obtaining an equivalent dictionary with 
annotated words in the .dic file)?

It will greatly help (if you are still stuck here) if you copy and paste 
ten lines (no attachments please) from your .dic file and .aff file.

Regards,
   Andrea.



RE: Spellchecker for Rhaeto-Romanic

Posted by Maria Hartmann <ma...@hotmail.de>.


In the "edit a word" window, what do I have to enter in the "tags" field?
The screenshot on the website http://marcoagpinto.cidadevirtual.pt/proofingtoolgui.html calls this field "flags".

RE: Spellchecker for Rhaeto-Romanic

Posted by Maria Hartmann <ma...@hotmail.de>.
Good morning, 
the Proofing Tool GUI is no help, sadly. It is possible to connect the words and the rules manually, but that's what I am trying to avoid...



> Date: Tue, 12 Aug 2014 08:59:56 +0200
> From: pescetti@apache.org
> To: l10n@openoffice.apache.org
> Subject: Re: Spellchecker for Rhaeto-Romanic

Re: Spellchecker for Rhaeto-Romanic

Posted by Andrea Pescetti <pe...@apache.org>.
On 11/08/2014 Maria Hartmann wrote:
> And my next problem...I created an affix file for my wordlist and
> tested single words with the command "hunspell -d mylanguage". Now I
> have to connect my rules with the words, but the "munch" command that
> is mentioned everywhere isn't working. I always get the error "Can't open
> munch" or that the command is written wrong. What's wrong? I am
> working on Windows and have Hunspell version 1.2.8. Is there
> another way to connect written rules with the words in my
> dic? Regards, Maria

Yes, use Marco's Proofing Tool GUI listed at
https://wiki.openoffice.org/wiki/Dictionaries

It will allow you to open the .dic and .aff files and see derived words 
without the need for unmunch.

Regards,
   Andrea.



RE: Spellchecker for Rhaeto-Romanic

Posted by Maria Hartmann <ma...@hotmail.de>.
And my next problem... I created an affix file for my wordlist and tested single words with the command "hunspell -d mylanguage". Now I have to connect my rules with the words, but the "munch" command that is mentioned everywhere isn't working. I always get the error "Can't open munch" or a message that the command is written wrong. What's wrong? I am working on Windows and have Hunspell version 1.2.8. Is there another way to connect written rules with the words in my dic?
Regards, Maria

RE: Spellchecker for Rhaeto-Romanic

Posted by Maria Hartmann <ma...@hotmail.de>.
I found it! You have to add FULLSTRIP to your affix file. Afterwards it is possible to write a rule like

SFX A Y 2
SFX A    be  are  be
SFX A    0    en    be

--> be, are, been
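
To make the effect of this rule concrete, here is a small Python sketch of how a single SFX line can be read: strip some characters from the end, add the new ending, and apply the rule only if the condition matches the end of the word. It is a simplified model and not Hunspell's implementation; apply_suffix and its fullstrip parameter are hypothetical names, and the treatment of FULLSTRIP follows the Hunspell manual's description that without it a rule cannot strip the whole word.

```python
import re


def apply_suffix(word, strip, add, condition, fullstrip=False):
    """Apply one SFX-style rule; return None if the rule does not apply."""
    strip = "" if strip == "0" else strip
    add = "" if add == "0" else add
    # The condition is matched against the end of the word.
    if not re.search(condition + "$", word):
        return None
    if not word.endswith(strip):
        return None
    # Without FULLSTRIP, a rule may not strip the entire word.
    if len(strip) >= len(word) and not fullstrip:
        return None
    return word[: len(word) - len(strip)] + add


# The two rules for flag A from the message above: (strip, add, condition).
rules = [("be", "are", "be"), ("0", "en", "be")]

with_fullstrip = ["be"] + [apply_suffix("be", s, a, c, fullstrip=True)
                           for s, a, c in rules]
without_fullstrip = [apply_suffix("be", s, a, c) for s, a, c in rules]

print(with_fullstrip)     # ['be', 'are', 'been']
print(without_fullstrip)  # [None, 'been']
```

The second list shows why the first rule only started working once FULLSTRIP was set: it strips all of "be" before adding "are".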

Re: Spellchecker for Rhaeto-Romanic

Posted by Andrea Pescetti <pe...@apache.org>.
On 05/08/2014 Maria Hartmann wrote:
> To write a spellchecker
> for Sursilvan, one of the varieties, is the topic of my bachelor
> thesis. I have the wordlist and I am writing the affix file at the
> moment. It already works for words where only a few characters are
> replaced rather than the whole word. How can I replace whole words?

Hello Maria, OpenOffice relies on Hunspell for its spell checking 
engine, so Hunspell is the right place for asking these questions.

For sure, from the point of view of a pure spell-checker, you can add 
"am" as a separate word from "be". This is what we do in the Italian 
spell-checker http://extensions.openoffice.org/node/1204 (you will see 
separate entries for "sono" and "essere", that mean "am" and "be" 
respectively, since they have no common characters in Italian either).

This may be inconvenient when one wants to add a thesaurus though, since 
it breaks the link between "I am" and "to be".

When you are done with your dictionary, please make sure to inform us 
again, and we can assist you in packaging it as an OpenOffice extension 
and making it available at the web site linked above.

Regards,
   Andrea.
