
Posted to dev@lucene.apache.org by Müller, Stephan <Mu...@ponton-consulting.de> on 2013/11/29 12:25:21 UTC

Patch proposal - LanguageIdentifierUpdateProcessor uses only firstValue() on multivalued fields

Hello list.

After the discussion in the thread "LanguageIdentifierUpdateProcessor uses only firstValue() on multivalued fields" on solr-user,
I would like to propose a patch to add the following feature:

LanguageIdentifierUpdateProcessor should use all (String) values of a multivalued field for language detection.

Currently, the LUP implicitly retrieves only the first value of a multivalued field.
All other values of such a field are therefore ignored. Furthermore, if for some reason the first value is not a String but later values are, no language detection happens at all for that field.
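
To illustrate the intended behavior, here is a minimal sketch in plain Java. It is not the actual patch: the class name MultiValueConcat and the method concatStringValues are hypothetical, and a plain Collection stands in for the values of a multivalued field as the processor would see them. The idea is simply to join every String value into one text for detection and to skip anything that is not a String.

```java
import java.util.Arrays;
import java.util.Collection;

public class MultiValueConcat {

    /**
     * Joins all String values of a (possibly multivalued) field into a single
     * text that could be handed to a language detector. Non-String values
     * (including null) are skipped instead of aborting detection.
     * Hypothetical helper for illustration only.
     */
    public static String concatStringValues(Collection<?> fieldValues) {
        StringBuilder sb = new StringBuilder();
        if (fieldValues == null) {
            return "";
        }
        for (Object value : fieldValues) {
            if (value instanceof String) {
                if (sb.length() > 0) {
                    sb.append(' '); // separate values so words do not run together
                }
                sb.append((String) value);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The first value is not a String; the later ones are.
        Collection<Object> values =
            Arrays.<Object>asList(42, "Hallo Welt", "Guten Morgen");
        System.out.println(concatStringValues(values));
    }
}
```

With a first-value-only approach the example field above would yield no text at all, whereas here both String values contribute to detection.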

I propose this patch here, following your contribution guidelines. 
It is unclear to me whether this scenario was simply overlooked or whether it was a conscious design decision.

So, let me hear what you think of this feature. Maybe you are already working on it.
If not, I'm eager to file my (probably first) feature request and patch on JIRA. 
I have a working trunk checkout set up in IDEA on OS X, and "ant clean install" reports "SUCCESS".


Looking forward to hearing from you!

Regards,
Stephan - srm

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

Re: Patch proposal - LanguageIdentifierUpdateProcessor uses only firstValue() on multivalued fields

Posted by Shalin Shekhar Mangar <sh...@gmail.com>.

Please feel free to open an issue. A patch would be great!

-- 
Regards,
Shalin Shekhar Mangar.
