Posted to dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2014/03/04 19:03:23 UTC

[jira] [Resolved] (SOLR-2934) Problem with Solr Hunspell with French Dictionary

     [ https://issues.apache.org/jira/browse/SOLR-2934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir resolved SOLR-2934.
-------------------------------

       Resolution: Fixed
    Fix Version/s:     (was: 4.7)
                   5.0
                   4.8

Currently we can load all the OpenOffice dictionaries (at least those from the old link).

I will test newer dictionaries later today (Thunderbird has a link), especially since its list has many that aren't in the OpenOffice list. This might reveal some issues to fix.

As for EN_GB.aff/.dic, I committed a fix for this (for now we use mark/reset to go back once we find the encoding, and I ensured the buffer is large enough).
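
A minimal sketch of the mark/reset idea, not the committed code: peek at the start of the affix stream for the Hunspell `SET <encoding>` line, rewind, then decode the whole file with that charset. The class name `AffixEncodingSniffer`, the `MAX_HEADER_BYTES` limit, and the ISO-8859-1 fallback are assumptions made for illustration.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the real fix lives in Lucene's Hunspell dictionary loader.
final class AffixEncodingSniffer {
    // Assumed upper bound on how far into the file the SET line may appear.
    static final int MAX_HEADER_BYTES = 8192;

    /** Peeks at the start of the stream for "SET <charset>", then rewinds it. */
    static Charset detectEncoding(BufferedInputStream in) throws IOException {
        // The read limit must cover everything read before reset(),
        // otherwise the mark is invalidated and reset() fails.
        in.mark(MAX_HEADER_BYTES);
        byte[] header = new byte[MAX_HEADER_BYTES];
        int total = 0;
        int n;
        while (total < header.length
                && (n = in.read(header, total, header.length - total)) > 0) {
            total += n;
        }
        in.reset(); // go back to the start so the caller can re-read with the right charset

        // ISO-8859-1 maps each byte to one char, so the ASCII "SET" line is readable
        // regardless of the file's real encoding.
        String text = new String(header, 0, total, StandardCharsets.ISO_8859_1);
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.startsWith("SET ")) {
                return Charset.forName(trimmed.substring(4).trim());
            }
        }
        return StandardCharsets.ISO_8859_1; // assumed fallback when no SET line is found
    }

    public static void main(String[] args) throws IOException {
        byte[] aff = "SET UTF-8\nFLAG long\n".getBytes(StandardCharsets.UTF_8);
        BufferedInputStream in =
                new BufferedInputStream(new ByteArrayInputStream(aff), MAX_HEADER_BYTES);
        Charset charset = detectEncoding(in);
        Reader reader = new InputStreamReader(in, charset); // starts again at byte 0
        System.out.println(charset + ", first char: " + (char) reader.read());
    }
}
```

If the header read exceeds the limit passed to `mark`, `reset` can throw an `IOException`, which is why the buffer size matters here.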

As for the original exception reported by the user (mixing EN_GB.dic with a French affix file): this is not supported. Affix files must "go with" the dictionary, as they contain information such as how characters and flags are encoded.

As for Stephen's issue: with long flags, each flag is two characters, so there should never be an odd number of flag characters. Something is wrong with the dictionary you are using. I haven't seen this in the wild with published dictionaries yet.
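
To illustrate why an odd-length flag string cannot be valid under `FLAG long`, here is a small sketch that splits a raw flag string into two-character flags and rejects odd lengths with an explicit message instead of running off the end of the string. `LongFlagParser` and `parseLongFlags` are hypothetical names, and this is not Lucene's parsing strategy.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only.
final class LongFlagParser {

    /** Splits a "FLAG long" flag string into its two-character flags. */
    static String[] parseLongFlags(String rawFlags) {
        if (rawFlags.length() % 2 != 0) {
            // A trailing single character cannot form a two-character flag.
            throw new IllegalArgumentException(
                    "Long flags need an even number of characters, got: '" + rawFlags + "'");
        }
        String[] flags = new String[rawFlags.length() / 2];
        for (int i = 0; i < flags.length; i++) {
            flags[i] = rawFlags.substring(2 * i, 2 * i + 2);
        }
        return flags;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parseLongFlags("AaBbCc")));
        try {
            parseLongFlags("AaB"); // odd length: malformed dictionary entry
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A parser that reads characters in pairs without the length check would instead fail on the last, unpaired character with an index-out-of-range error.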


> Problem with Solr Hunspell with French Dictionary
> -------------------------------------------------
>
>                 Key: SOLR-2934
>                 URL: https://issues.apache.org/jira/browse/SOLR-2934
>             Project: Solr
>          Issue Type: Bug
>          Components: Schema and Analysis
>    Affects Versions: 3.5
>         Environment: Windows 7
>            Reporter: Nathan Castelein
>            Assignee: Chris Male
>             Fix For: 4.8, 5.0
>
>         Attachments: en_GB.aff, en_GB.dic
>
>
> I'm trying to add the HunspellStemFilterFactory to my Solr project.
> I'm trying this on a fresh download of Solr 3.5.
> I downloaded the French dictionary here (found it from here): http://www.dicollecte.org/download/fr/hunspell-fr-moderne-v4.3.zip
> But when I start Solr and go to the Solr Analysis page, an error occurs in Solr.
> Here is the trace:
> java.lang.RuntimeException: Unable to load hunspell data! [dictionary=en_GB.dic,affix=fr-moderne.aff]
> 	at org.apache.solr.analysis.HunspellStemFilterFactory.inform(HunspellStemFilterFactory.java:82)
> 	at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:546)
> 	at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:126)
> 	at org.apache.solr.core.CoreContainer.create(CoreContainer.java:461)
> 	at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
> 	at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207)
> 	at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:130)
> 	at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:94)
> 	at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97)
> 	at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
> 	at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:713)
> 	at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
> 	at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1282)
> 	at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:518)
> 	at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:499)
> 	at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
> 	at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
> 	at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
> 	at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
> 	at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
> 	at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
> 	at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
> 	at org.mortbay.jetty.Server.doStart(Server.java:224)
> 	at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
> 	at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:985)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
> 	at java.lang.reflect.Method.invoke(Unknown Source)
> 	at org.mortbay.start.Main.invokeMain(Main.java:194)
> 	at org.mortbay.start.Main.start(Main.java:534)
> 	at org.mortbay.start.Main.start(Main.java:441)
> 	at org.mortbay.start.Main.main(Main.java:119)
> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: 3
> 	at java.lang.String.charAt(Unknown Source)
> 	at org.apache.lucene.analysis.hunspell.HunspellDictionary$DoubleASCIIFlagParsingStrategy.parseFlags(HunspellDictionary.java:382)
> 	at org.apache.lucene.analysis.hunspell.HunspellDictionary.parseAffix(HunspellDictionary.java:165)
> 	at org.apache.lucene.analysis.hunspell.HunspellDictionary.readAffixFile(HunspellDictionary.java:121)
> 	at org.apache.lucene.analysis.hunspell.HunspellDictionary.<init>(HunspellDictionary.java:64)
> 	at org.apache.solr.analysis.HunspellStemFilterFactory.inform(HunspellStemFilterFactory.java:46)
> I can't find where the problem is. It seems like my dictionary isn't well formed for Hunspell, but I tried with two different dictionaries and had the same problem.
> I also tried with an English dictionary, and ... it works!
> So I think that my French dictionary is wrong for Hunspell, but I don't know why ...
> Can you help me?



--
This message was sent by Atlassian JIRA
(v6.2#6252)
