You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Yonik Seeley (JIRA)" <ji...@apache.org> on 2015/12/18 18:46:46 UTC

[jira] [Commented] (SOLR-8412) SchemaManager should synchronize on performOperations method

    [ https://issues.apache.org/jira/browse/SOLR-8412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15064306#comment-15064306 ] 

Yonik Seeley commented on SOLR-8412:
------------------------------------

bq. We should synzhronize on performOperations instead. The net affect will be the same but the code will be more clear.

Changing complex synchronization causes warning bells to go off.
Are you sure that the net effect is the same?  I'm not familiar with this part of the code, so hopefully someone else who is can chime in... but at first blush it definitely doesn't look safe.
This patch changes the locking from using schemaUpdateLock (which is shared among multiple objects) to using either schemaUpdateLock or the current objects monitor.  It's certainly not simpler or clearer to try and figure out if things are still thread safe.

Reviewing the existing code some, I see this:
- SchemaManager.performOperations() calls doOperations() protected by schemaUpdateLock
  - this performs a list of operations on the latest ManagedIndexSchema object, which *may* be created fresh, but will be passed the same schemaUpdateLock
  - these operations can call things like addFields()

AddSchemaFieldsUpdateProcessor has this:
{code}
        // Need to hold the lock during the entire attempt to ensure that
        // the schema on the request is the latest
        synchronized (oldSchema.getSchemaUpdateLock()) {
          try {
            IndexSchema newSchema = oldSchema.addFields(newFields);
{code}
But with the patch, we're locking on a different object, so what the comment asserts it is trying to do may be broken?
Actually, it's not at all clear to me why even in the current code, we don't need to grab the latest schema again *after* we lock the update lock.

Moving on, to addFields(), it looks like it can (with the patch) now be called on the same object with two different locks held.  Or even on different objects it's not clear if it's now safe.

Bottom line: the synchronization in the current code is complex enough that I don't know if the proposed simplifications are safe or not.  If you could add some explanation around that, it would be great.


> SchemaManager should synchronize on performOperations method
> ------------------------------------------------------------
>
>                 Key: SOLR-8412
>                 URL: https://issues.apache.org/jira/browse/SOLR-8412
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Varun Thacker
>            Priority: Minor
>         Attachments: SOLR-8412.patch, SOLR-8412.patch, SOLR-8412.patch
>
>
> Currently SchemaManager synchronizes on {{schema.getSchemaUpdateLock()}} . We should synzhronize on {{performOperations}} instead. 
> The net affect will be the same but the code will be more clear. {{schema.getSchemaUpdateLock()}} is used when you want to edit a schema and add one field at a time. But the way SchemaManager works is that it does bulk operations i.e performs all operations and then persists the final schema . If there were two concurrent operations that took place, the later operation will retry by fetching the latest schema .



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org