Posted to issues@lucene.apache.org by "David Smiley (Jira)" <ji...@apache.org> on 2019/10/18 14:47:08 UTC

[jira] [Commented] (SOLR-13850) Atomic Updates with PreAnalyzedField

    [ https://issues.apache.org/jira/browse/SOLR-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16954664#comment-16954664 ] 

David Smiley commented on SOLR-13850:
-------------------------------------

I'm not even sure it's meaningful to have a pre-analyzed field be "stored".

> Atomic Updates with PreAnalyzedField
> ------------------------------------
>
>                 Key: SOLR-13850
>                 URL: https://issues.apache.org/jira/browse/SOLR-13850
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 7.7.2, 8.2
>         Environment: Ubuntu 16.04 LTS / Java 8 (Zulu), Windows 10 / Java 11 (Oracle)
>            Reporter: Oleksandr Drapushko
>            Priority: Critical
>              Labels: AtomicUpdate
>
> If you update a non-pre-analyzed field in a document with an atomic update, any data in that document's pre-analyzed fields is lost.
> *Steps to reproduce*
> 1. Index this document into techproducts
> {code:json}
> {
>   "id": "a",
>   "n_s": "s1",
>   "pre": "{\"v\":\"1\",\"str\":\"Alaska\",\"tokens\":[{\"t\":\"alaska\",\"s\":0,\"e\":6,\"i\":1}]}"
> }
> {code}
> 2. Query the document
> {code:json}
> {
>   "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
>     {
>       "id":"a",
>       "n_s":"s1",
>       "pre":"Alaska",
>       "_version_":1647475215142223872}]
> }}
> {code}
> 3. Update using atomic syntax
> {code:json}
> {
>   "add": {
>     "doc": {
>       "id": "a",
>       "n_s": {"set": "s2"}
> }}}
> {code}
> 4. Observe the warning in the Solr log
> UI:
> {noformat}
>  WARN x:techproducts_shard2_replica_n6 PreAnalyzedField Error parsing pre-analyzed field 'pre'
> {noformat}
> solr.log:
> {noformat}
> WARN (qtp1384454980-23) [c:techproducts s:shard2 r:core_node8 x:techproducts_shard2_replica_n6] o.a.s.s.PreAnalyzedField Error parsing pre-analyzed field 'pre' => java.io.IOException: Invalid JSON type java.lang.String, expected Map
>  at org.apache.solr.schema.JsonPreAnalyzedParser.parse(JsonPreAnalyzedParser.java:86)
> {noformat}
> 5. Query the document again
> {code:json}
> {
>   "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
>     {
>       "id":"a",
>       "n_s":"s2",
>       "_version_":1647475461695995904}]
> }}
> {code}
> *Result*: There is no 'pre' field in the document anymore.
> _My thoughts on it_
> 1. Data loss can be prevented by replacing the warning with an error (re-throwing the exception). Atomic updates for such documents still won't work, but they will be explicitly rejected instead of dropping the field.
> 2. Solr reads the document from the index, merges it with the input document, and re-indexes the result. The value it reads back for a pre-analyzed field is the plain stored string ("Alaska" in step 2), not the JSON the parser expects, so Solr cannot parse and re-index that field.
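
The round trip described in the two points above can be sketched in a few lines. This is only a simulation under stated assumptions, not Solr's code: `parse_pre_analyzed`, `index_doc`, `atomic_update`, and the `strict` flag are hypothetical names invented for illustration. Only the field names, the sample values, and the "expected Map" message shape come from the report.

```python
import json

# JSON form accepted for the pre-analyzed field on input (step 1 of the report).
PRE_INPUT = '{"v":"1","str":"Alaska","tokens":[{"t":"alaska","s":0,"e":6,"i":1}]}'


def parse_pre_analyzed(raw):
    """Hypothetical stand-in for the parser: accept only a JSON object."""
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        # A bare string such as "Alaska" is not a JSON document at all.
        value = raw
    if not isinstance(value, dict):
        raise ValueError(f"Invalid JSON type {type(value).__name__}, expected Map")
    return value


def index_doc(doc, strict):
    """Simulate indexing: keep only the stored form of each field."""
    stored = {}
    for name, raw in doc.items():
        if name != "pre":
            stored[name] = raw
            continue
        try:
            stored[name] = parse_pre_analyzed(raw)["str"]
        except ValueError:
            if strict:
                raise  # proposed behavior: reject the whole update
            # reported behavior: a warning is logged and the field is dropped
    return stored


def atomic_update(stored_doc, changes, strict=False):
    """Simulate an atomic update: merge changes into the stored doc, re-index."""
    merged = {**stored_doc, **changes}
    return index_doc(merged, strict)


if __name__ == "__main__":
    stored = index_doc({"id": "a", "n_s": "s1", "pre": PRE_INPUT}, strict=False)
    print(stored)                                  # 'pre' is stored as "Alaska"
    print(atomic_update(stored, {"n_s": "s2"}))    # 'pre' is missing afterwards
```

With `strict=True` the same update raises instead of returning a document without 'pre', which corresponds to the first suggestion: the atomic update still fails, but nothing is lost.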



