Posted to issues@solr.apache.org by "Calvin Smith (Jira)" <ji...@apache.org> on 2024/01/19 19:42:00 UTC

[jira] [Comment Edited] (SOLR-17120) NullPointerException in UpdateLog.applyOlderUpdates in solr 6.6-9.4 involving partial updates

    [ https://issues.apache.org/jira/browse/SOLR-17120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17808405#comment-17808405 ] 

Calvin Smith edited comment on SOLR-17120 at 1/19/24 7:41 PM:
--------------------------------------------------------------

I compared the effect of

 
{code:java}
if (values == null) continue;{code}

versus calling {{addField}} with {{null}} as I had, and I wasn't able to observe any difference in outcome. In both cases, the field removals requested by the earlier updates that included 
{code:java}
{"set": null}{code}
were still processed successfully. Right after the update happened, but before the {{commitWithin}} window had passed, I would still see the old value when querying the {{/select}} endpoint, while the {{/get}} endpoint no longer showed the fields that were to have been removed, whether or not the later updates called {{addField}} with {{null}}. Once the {{commitWithin}} window had passed, the {{/select}} endpoint correctly showed that the documents had been updated as expected: the fields had been removed and the other expected changes had been applied.

So it seems that {{newerDoc}} definitely doesn't need the {{null}}, and the {{null}} may not have any effect when it is present: as far as I can tell from inspecting the documents in the index afterwards, things work the same in both cases.

I'll stick with the version Christine suggested that does {{continue}} while I'm testing.
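
For reference, the {{continue}} variant can be sketched outside of Solr as follows. This is a simplified stand-in rather than the actual {{UpdateLog}} code: plain {{Map<String, List<Object>>}} values take the place of the Solr document classes, a {{null}} map value models {{getFieldValues}} returning {{null}}, and the class name {{ApplyOlderUpdatesSketch}} and the {{title}} field are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Simplified stand-in for UpdateLog.applyOlderUpdates; maps replace the Solr document classes. */
public class ApplyOlderUpdatesSketch {

  /** Copies fields from olderDoc into newerDoc unless newerDoc already has them. */
  static void applyOlderUpdates(
      Map<String, List<Object>> newerDoc,
      Map<String, List<Object>> olderDoc,
      Set<String> mergeFields) {
    for (String fieldName : olderDoc.keySet()) {
      // if the newerDoc has this field, then this field from olderDoc can be ignored
      if (!newerDoc.containsKey(fieldName)
          && (mergeFields == null || mergeFields.contains(fieldName))) {
        List<Object> values = olderDoc.get(fieldName);
        // Assumption: a null here models a field that an earlier update set to null.
        if (values == null) continue;
        for (Object val : values) {
          newerDoc.computeIfAbsent(fieldName, k -> new ArrayList<>()).add(val);
        }
      }
    }
  }

  public static void main(String[] args) {
    Map<String, List<Object>> older = new HashMap<>();
    older.put("camera_unit", null); // models {"set": null} from the earlier update
    older.put("title", Arrays.asList((Object) "old title")); // hypothetical field

    Map<String, List<Object>> newer = new HashMap<>();
    newer.put("playlist", new ArrayList<>(Arrays.asList((Object) 12345)));

    applyOlderUpdates(newer, older, null);
    System.out.println("camera_unit present: " + newer.containsKey("camera_unit"));
    System.out.println("title: " + newer.get("title"));
  }
}
```

In this sketch the null-valued field is skipped without an exception, while fields that have values are still merged into the newer document.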


> NullPointerException in UpdateLog.applyOlderUpdates in solr 6.6-9.4 involving partial updates
> ---------------------------------------------------------------------------------------------
>
>                 Key: SOLR-17120
>                 URL: https://issues.apache.org/jira/browse/SOLR-17120
>             Project: Solr
>          Issue Type: Bug
>          Components: update
>    Affects Versions: 6.6.2, 8.11.2, 9.4
>         Environment: The issue occurred on Linux, CentOS 7.9, with the following JDK version:
> {noformat}
> openjdk version "11.0.20" 2023-07-18 LTS
> OpenJDK Runtime Environment (Red_Hat-11.0.20.0.8-1.el7_9) (build 11.0.20+8-LTS)
> OpenJDK 64-Bit Server VM (Red_Hat-11.0.20.0.8-1.el7_9) (build 11.0.20+8-LTS, mixed mode, sharing){noformat}
>            Reporter: Calvin Smith
>            Priority: Major
>
> I mailed the solr-users mailing list about this issue, but didn't get any responses there, so am creating this issue. The subject of the email thread for additional context was "NullPointerException in UpdateLog.applyOlderUpdates under solr 8&9 involving partial updates and high update load" - link: [https://lists.apache.org/thread/n9zm4gocl7cf073syy1159dy6ojjrywl]
> I'm seeing a Solr HTTP 500 error when performing a partial update of a document, which turns out to be triggered by a recent earlier update of the same document that included a partial update setting a field to {{null}}. I've observed the behavior in versions 6.6.2, 8.11.2, and 9.4.0, which are the only three versions I've tried.
> To give an example, an update doc like
>  
> {code:java}
> {
>     "id": "123", 
>     "camera_unit": {"set": null}
> }{code}
>  
> may be followed shortly thereafter, after some other updates for different documents, by another update of the same document (I'm not sure of the exact timing, but I was using a {{commitWithin}} of 600s and the subsequent updates came less than 20 seconds later), like
>  
> {code:java}
> {
>     "id": "123", 
>     "playlist": {
>       "set": [
>         12345
>       ]
>     },
>     "playlist_index_321": {
>       "set": 0
>     }
> }{code}
>  
> This later update may, but doesn't always, cause the {{NullPointerException}}, so some other condition, such as the state of the {{tlog}}, must also be satisfied for the error to occur.
> The exception is thrown by the following code in {{UpdateLog.java}} ({{org.apache.solr.update.UpdateLog}}):
>  
> {code:java}
> /** Add all fields from olderDoc into newerDoc if not already present in newerDoc */
>   private void applyOlderUpdates(
>       SolrDocumentBase<?, ?> newerDoc, SolrInputDocument olderDoc, Set<String> mergeFields) {
>     for (String fieldName : olderDoc.getFieldNames()) {
>       // if the newerDoc has this field, then this field from olderDoc can be ignored
>       if (!newerDoc.containsKey(fieldName)
>           && (mergeFields == null || mergeFields.contains(fieldName))) {
>         for (Object val : olderDoc.getFieldValues(fieldName)) {
>           newerDoc.addField(fieldName, val);
>         }
>       }
>     }
>   }{code}
>  
> The exception is due to the inner {{for}} statement trying to iterate over the {{null}} value returned by {{olderDoc.getFieldValues(fieldName)}}.
> When I change that method to the following:
>  
> {code:java}
> /** Add all fields from olderDoc into newerDoc if not already present in newerDoc */
>   private void applyOlderUpdates(
>       SolrDocumentBase<?, ?> newerDoc, SolrInputDocument olderDoc, Set<String> mergeFields) {
>     for (String fieldName : olderDoc.getFieldNames()) {
>       // if the newerDoc has this field, then this field from olderDoc can be ignored
>       if (!newerDoc.containsKey(fieldName)
>           && (mergeFields == null || mergeFields.contains(fieldName))) {
>         Collection<Object> values = olderDoc.getFieldValues(fieldName);
>         if (values == null) {
>             newerDoc.addField(fieldName, null);
>         } else {
>             for (Object val : values) {
>               newerDoc.addField(fieldName, val);
>             }
>         }
>       }
>     }
>   }{code}
>  
> Then after rebuilding the solr-core JAR with {{./gradlew devFull}} and restarting Solr with that custom jar file, I can no longer reproduce the error.
> I'm not familiar with the Solr codebase though and am not at all sure that {{newerDoc.addField(fieldName, null)}} is what should be done there.
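
The root cause described in the issue, an enhanced {{for}} loop over a {{null}} collection, can be reproduced in isolation. The following is a minimal sketch in plain Java with no Solr dependency; the class and method names ({{NullForEachSketch}}, {{forEachThrowsNpe}}) are hypothetical and serve only to illustrate the language behavior.

```java
import java.util.Collection;
import java.util.Collections;

/** Shows that an enhanced for loop over a null collection throws NullPointerException. */
public class NullForEachSketch {

  /** Returns true if iterating the given collection with for-each throws a NullPointerException. */
  static boolean forEachThrowsNpe(Collection<Object> values) {
    try {
      // The enhanced for loop calls values.iterator(), which fails when values is null.
      for (Object val : values) {
        // The loop body is irrelevant; the failure happens before the first iteration.
      }
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("null collection throws: " + forEachThrowsNpe(null));
    System.out.println("empty collection throws: " + forEachThrowsNpe(Collections.emptyList()));
  }
}
```

An empty collection iterates zero times without error, so only the {{null}} return value needs an explicit guard before the loop.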


