Posted to issues@lucene.apache.org by "David Smiley (Jira)" <ji...@apache.org> on 2019/10/18 14:47:08 UTC
[jira] [Commented] (SOLR-13850) Atomic Updates with PreAnalyzedField
[ https://issues.apache.org/jira/browse/SOLR-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16954664#comment-16954664 ]
David Smiley commented on SOLR-13850:
-------------------------------------
I'm not even sure it's meaningful to have a pre-analyzed field be "stored".
> Atomic Updates with PreAnalyzedField
> ------------------------------------
>
> Key: SOLR-13850
> URL: https://issues.apache.org/jira/browse/SOLR-13850
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Affects Versions: 7.7.2, 8.2
> Environment: Ubuntu 16.04 LTS / Java 8 (Zulu), Windows 10 / Java 11 (Oracle)
> Reporter: Oleksandr Drapushko
> Priority: Critical
> Labels: AtomicUpdate
>
> If you use an atomic update to modify fields that are not pre-analyzed, any data in the document's pre-analyzed fields is lost.
> *Steps to reproduce*
> 1. Index this document into techproducts
> {code:json}
> {
> "id": "a",
> "n_s": "s1",
> "pre": "{\"v\":\"1\",\"str\":\"Alaska\",\"tokens\":[{\"t\":\"alaska\",\"s\":0,\"e\":6,\"i\":1}]}"
> }
> {code}
> 2. Query the document
> {code:json}
> {
> "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
> {
> "id":"a",
> "n_s":"s1",
> "pre":"Alaska",
> "_version_":1647475215142223872}]
> }}
> {code}
> 3. Update using atomic syntax
> {code:json}
> {
> "add": {
> "doc": {
> "id": "a",
> "n_s": {"set": "s2"}
> }}}
> {code}
> 4. Observe the warning in the Solr log
> UI:
> {noformat}
> WARN x:techproducts_shard2_replica_n6 PreAnalyzedField Error parsing pre-analyzed field 'pre'
> {noformat}
> solr.log:
> {noformat}
> WARN (qtp1384454980-23) [c:techproducts s:shard2 r:core_node8 x:techproducts_shard2_replica_n6] o.a.s.s.PreAnalyzedField Error parsing pre-analyzed field 'pre' => java.io.IOException: Invalid JSON type java.lang.String, expected Map
> at org.apache.solr.schema.JsonPreAnalyzedParser.parse(JsonPreAnalyzedParser.java:86)
> {noformat}
> 5. Query the document again
> {code:json}
> {
> "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[
> {
> "id":"a",
> "n_s":"s2",
> "_version_":1647475461695995904}]
> }}
> {code}
> *Result*: There is no 'pre' field in the document anymore.
> _My thoughts on it_
> 1. Data loss can be prevented by replacing the warning with an error (re-throwing the exception). Atomic updates for such documents still won't work, but they will be explicitly rejected instead of silently dropping data.
> 2. During an atomic update, Solr reads the document from the index, merges it with the input document, and re-indexes the result. However, the stored value of a pre-analyzed field comes back in a different format (the plain string, e.g. "Alaska") than the JSON the parser expects on input, so Solr cannot parse and re-index those fields.
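
The two points above can be illustrated with a small simulation. This is only a sketch of the behaviour described in the report, not Solr's actual code: the function names (parse_pre_analyzed, stored_value, atomic_update) and the strict flag are hypothetical, and the parser is a simplified stand-in that merely checks for a JSON object.

```python
import json


def parse_pre_analyzed(raw):
    """Hypothetical stand-in for the pre-analyzed JSON parser: requires a JSON object."""
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        value = raw  # not JSON at all, e.g. the bare stored string "Alaska"
    if not isinstance(value, dict):
        raise ValueError(f"Invalid JSON type {type(value).__name__}, expected Map")
    return value


def stored_value(raw):
    """What a query returns for a pre-analyzed field: only the 'str' part."""
    return parse_pre_analyzed(raw)["str"]


def atomic_update(stored_doc, changes, pre_fields, strict):
    """Merge changes into the stored document and re-index every field.

    strict=False mimics the reported behaviour (warn and drop the field);
    strict=True mimics the proposed fix (reject the whole update).
    """
    merged = {**stored_doc, **changes}
    reindexed = {}
    for field, value in merged.items():
        if field in pre_fields:
            try:
                value = parse_pre_analyzed(value)["str"]
            except ValueError:
                if strict:
                    raise  # proposed: fail the update, keep the old document
                continue  # reported: field silently disappears
        reindexed[field] = value
    return reindexed


# Step 1 and 2 of the report: index the document, then read it back.
original = json.dumps(
    {"v": "1", "str": "Alaska", "tokens": [{"t": "alaska", "s": 0, "e": 6, "i": 1}]}
)
stored = {"id": "a", "n_s": "s1", "pre": stored_value(original)}

# Step 3 to 5: the atomic update re-parses the stored string and loses 'pre'.
lenient = atomic_update(stored, {"n_s": "s2"}, {"pre"}, strict=False)
```

With strict=False the result no longer contains 'pre', matching step 5 of the report. With strict=True the same call raises ValueError, so the caller learns the update was rejected and the previously indexed document is left untouched.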