Posted to issues@lucene.apache.org by "Colvin Cowie (Jira)" <ji...@apache.org> on 2019/11/23 17:02:00 UTC
[jira] [Updated] (SOLR-13963) JavaBinCodec has concurrent
modification of CharArr resulting in corrupt intranode updates
[ https://issues.apache.org/jira/browse/SOLR-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Colvin Cowie updated SOLR-13963:
--------------------------------
Description:
Discussed on the mailing list "Possible data corruption in JavaBinCodec in Solr 8.3 during distributed update?"
In summary, after moving to 8.3 we had a consistent (but non-deterministic) set of failing tests where the data being sent in intranode requests was _sometimes_ corrupted. For example, if the well-formed data was
_'fieldName':"this is a long string"_
The error we saw from Solr might be that
unknown field _+'fieldNamis a long string"+_
The change that indirectly caused this issue to materialize came from SOLR-13682, which meant that org.apache.solr.common.SolrInputDocument.writeMap(EntryWriter) would call org.apache.solr.common.SolrInputField.getValue() rather than org.apache.solr.common.SolrInputField.getRawValue() as it had before.
getValue for a string calls org.apache.solr.common.util.ByteArrayUtf8CharSequence._getStr(), which in this context calls
org.apache.solr.common.util.JavaBinCodec.getStringProvider()
JavaBinCodec has a CharArr, _arr_, which is modified in two different locations, only one of which is protected by a synchronized block.
getStringProvider() synchronizes on _arr_:
[https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L966]
but _readStr() doesn't:
[https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/common/util/JavaBinCodec.java#L930]
The two methods are now called concurrently, but weren't prior to SOLR-13682.
Adding a synchronized block into _readStr() around the modification of _arr_ fixes the problem as far as I can see.
Also, the problem does not seem to occur when using the dynamic schema mode of autoCreateFields=true in the updateRequestProcessorChain.
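To illustrate the proposed fix, here is a minimal, self-contained sketch of the locking pattern. It is not Solr's code: the class SharedBufferSketch and the methods viaStringProvider, decode and countCorrupted are hypothetical names, and a StringBuilder stands in for the CharArr field _arr_. The only point is that every path that resets and fills the shared buffer takes the same lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for a codec that reuses one scratch buffer; not Solr's actual code. */
class SharedBufferSketch {
    // Plays the role of JavaBinCodec's "arr": a single buffer reused by every decode.
    private final StringBuilder arr = new StringBuilder();

    /** Analogue of the getStringProvider() path, which already synchronizes on arr. */
    public String viaStringProvider(String encoded) {
        synchronized (arr) {
            return decode(encoded);
        }
    }

    /** Analogue of _readStr() with the proposed fix: the same lock guards the buffer. */
    public String readStr(String encoded) {
        synchronized (arr) { // dropping this lock would reintroduce the race
            return decode(encoded);
        }
    }

    // Resets, fills, and copies out of the shared buffer; callers must hold the lock on arr.
    private String decode(String encoded) {
        arr.setLength(0);
        for (int i = 0; i < encoded.length(); i++) {
            arr.append(encoded.charAt(i));
        }
        return arr.toString();
    }

    /** Runs both paths concurrently and counts results that differ from their input. */
    public static int countCorrupted(int threads, int iterations) {
        SharedBufferSketch codec = new SharedBufferSketch();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final boolean useProvider = (t % 2 == 0);
                final String input = useProvider
                        ? "'fieldName':\"this is a long string\""
                        : "short value";
                results.add(pool.submit(() -> {
                    int bad = 0;
                    for (int i = 0; i < iterations; i++) {
                        String out = useProvider
                                ? codec.viaStringProvider(input)
                                : codec.readStr(input);
                        if (!out.equals(input)) {
                            bad++;
                        }
                    }
                    return bad;
                }));
            }
            int corrupted = 0;
            for (Future<Integer> f : results) {
                corrupted += f.get();
            }
            return corrupted;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("corrupted results: " + countCorrupted(4, 10_000));
    }
}
```

With both paths holding the lock, the corrupted count should stay at zero. Removing the synchronized block from readStr in this sketch allows two threads to interleave their writes to the buffer, which is the kind of mixed output described above.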
> JavaBinCodec has concurrent modification of CharArr resulting in corrupt intranode updates
> -------------------------------------------------------------------------------------------
>
> Key: SOLR-13963
> URL: https://issues.apache.org/jira/browse/SOLR-13963
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Affects Versions: 8.3
> Reporter: Colvin Cowie
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)