Posted to dev@lucene.apache.org by "Erick Erickson (JIRA)" <ji...@apache.org> on 2015/02/23 03:48:11 UTC

[jira] [Closed] (SOLR-7085) Add a comment to the schema.xml file(s) warning against applying analysis chains to the <uniqueKey> field.

     [ https://issues.apache.org/jira/browse/SOLR-7085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson closed SOLR-7085.
--------------------------------
       Resolution: Fixed
    Fix Version/s: 5.1
                   Trunk

> Add a comment to the schema.xml file(s) warning against applying analysis chains to the <uniqueKey> field.
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-7085
>                 URL: https://issues.apache.org/jira/browse/SOLR-7085
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>            Priority: Minor
>             Fix For: Trunk, 5.1
>
>         Attachments: SOLR-7085.patch
>
>
> If you apply index-time transformations to the <uniqueKey> field, very interesting things happen, all of them bad.
> 1> The doc doesn't get updated.
> 2> Docs are routed to shards based on the original form of the ID field.
> I stopped looking there. There are much bigger fish to fry than trying to apply an index-time analysis chain to the <uniqueKey>, so a comment in the schema.xml seems to be all that is necessary.
> Trying to change this at a code level would be a nightmare, I suspect. Consider routing by a secondary field, for instance, and the N+1 other places this would pop out.
> Limited _query_ time transformations are OK as long as they match the indexing program's transformations. About the only one I'd recommend is lowercasing, but others are possible if you're brave.
> My "rule of thumb" I was trying to apply here is that "anything a human enters in your search app should not be case-sensitive when searching", and it can be enforced easily enough.
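
The two symptoms in the description, and the lowercasing workaround, can be illustrated with a toy model. This is only a sketch and not Solr code: ToyIndex, analyze(), route(), normalize() and NUM_SHARDS are hypothetical names, and the model simply assumes that the overwrite step looks up the old doc by the raw ID while the index stores the analyzed form.

```python
import zlib

NUM_SHARDS = 4  # hypothetical cluster size, for illustration only


def analyze(raw_id: str) -> str:
    """Stand-in for an index-time analysis chain (here: lowercasing only)."""
    return raw_id.lower()


def route(raw_id: str) -> int:
    """Toy router: picks a shard from the ORIGINAL form of the ID."""
    return zlib.crc32(raw_id.encode("utf-8")) % NUM_SHARDS


def normalize(user_input: str) -> str:
    """Transformation applied by BOTH the indexing program and the search app."""
    return user_input.lower()


class ToyIndex:
    """Toy index. Assumption: the doc to overwrite is found by the raw ID,
    while docs are stored under the (possibly analyzed) ID."""

    def __init__(self):
        self.docs = []  # list of (indexed_id, doc) pairs

    def add(self, raw_id, doc, analyze_ids):
        indexed_id = analyze(raw_id) if analyze_ids else raw_id
        # Overwrite step: drop any existing doc whose indexed ID equals the raw ID.
        self.docs = [(k, d) for k, d in self.docs if k != raw_id]
        self.docs.append((indexed_id, doc))


if __name__ == "__main__":
    # Analysis chain on the ID field: the second add does not replace the first.
    broken = ToyIndex()
    broken.add("ABC", {"v": 1}, analyze_ids=True)
    broken.add("ABC", {"v": 2}, analyze_ids=True)
    print(len(broken.docs), "docs with an analyzed ID field")

    # Plain string ID, lowercased by the client on both sides: one doc remains.
    ok = ToyIndex()
    ok.add(normalize("ABC"), {"v": 1}, analyze_ids=False)
    ok.add(normalize("abc"), {"v": 2}, analyze_ids=False)
    print(len(ok.docs), "doc with client-side lowercasing")

    # Routing sees the original form, so these two may land on different shards.
    print(route("ABC"), route("abc"))
```

In this model the fix is to keep the ID field un-analyzed and do the lowercasing in the indexing program and the search app, which is the matching-transformations rule from the description.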


