Posted to issues@lucene.apache.org by "Yonik Seeley (Jira)" <ji...@apache.org> on 2019/10/02 18:24:00 UTC

[jira] [Commented] (SOLR-13813) Shared storage online split support

    [ https://issues.apache.org/jira/browse/SOLR-13813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943057#comment-16943057 ] 

Yonik Seeley commented on SOLR-13813:
-------------------------------------

This PR ( https://github.com/apache/lucene-solr/pull/918 ) adds a simple test.
It usually fails for shared storage :-( . Example:
{code}
java.lang.AssertionError: 
Expected :50
Actual   :49
{code}
And I normally see background exceptions like the following during the run:
{code}
Error occured pulling shard=shard1_1 collection=livesplit1 from shared store java.lang.Exception: Local Directory content /private/var/folders/_f/2q_bxy9d0kz45_rk451rds3n9r_x9g/T/solr.store.blob.SharedStorageSplitTest_E4343DDDB931B9AE-001/tempDir-001/node1/./livesplit1_shard1_1_replica_s4/data/index/ has changed since Blob pull started. Aborting pull.
	at org.apache.solr.store.blob.util.BlobStoreUtils.syncLocalCoreWithSharedStore(BlobStoreUtils.java:128)
	at org.apache.solr.update.processor.DistributedZkUpdateProcessor.readFromSharedStoreIfNecessary(DistributedZkUpdateProcessor.java:1096)
	at org.apache.solr.update.processor.DistributedZkUpdateProcessor.processCommit(DistributedZkUpdateProcessor.java:202)
	at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:160)
	at org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69)
	at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:62)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:200)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:2609)
	at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:816)

{code}

My best guess is that this is caused by missing concurrency support, which still needs to be addressed in the blob puller/pusher code.
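
To make the suspected race concrete, here is a minimal sketch of a pull that aborts when the local directory changes underneath it, as the "has changed since Blob pull started" message suggests. It is a toy model and not the actual BlobStoreUtils code: the class PullAbortSketch, the methods snapshot() and pull(), and the use of a Map in place of the index directory are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a toy model of a "pull" that aborts when the local
// directory changes underneath it. None of these names come from Solr.
class PullAbortSketch {

    // Hypothetical stand-in for listing the local index directory (file name -> length).
    static Map<String, Long> snapshot(Map<String, Long> localDir) {
        return new HashMap<>(localDir);
    }

    // Returns true if the pull completed, false if it aborted because the
    // local directory content changed between the start and end of the pull.
    static boolean pull(Map<String, Long> localDir, Runnable concurrentActivity) {
        Map<String, Long> atStart = snapshot(localDir);

        // Stands in for whatever else touches the core while blob files download,
        // e.g. forwarded updates being indexed during a split.
        concurrentActivity.run();

        if (!atStart.equals(snapshot(localDir))) {
            return false; // content changed since the pull started: abort
        }
        localDir.put("pulled_from_blob", 1L); // stands in for installing the pulled files
        return true;
    }

    public static void main(String[] args) {
        Map<String, Long> quiet = new HashMap<>();
        quiet.put("_0.cfs", 10L);
        System.out.println("pull with no concurrent writes completed: " + pull(quiet, () -> {}));

        Map<String, Long> busy = new HashMap<>();
        busy.put("_0.cfs", 10L);
        System.out.println("pull with a concurrent write completed: "
                + pull(busy, () -> busy.put("_1.cfs", 5L)));
    }
}
```

In this model the pull only completes when nothing else writes to the directory while it runs. Whether the real failure follows this exact pattern is not established here; the sketch only shows why a check-and-abort pull needs some coordination with concurrent indexing.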

> Shared storage online split support
> -----------------------------------
>
>                 Key: SOLR-13813
>                 URL: https://issues.apache.org/jira/browse/SOLR-13813
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Yonik Seeley
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The strategy for online shard splitting is the same as that for normal (non-SHARED) shards.
> During a split, the leader forwards updates to the sub-shard leaders. Those updates are buffered by the transaction log while the split is in progress, and the buffered updates are then replayed.
> One change that was added is to push the local index to the blob store after the buffered updates are applied (but before it is marked as ACTIVE):
> See https://github.com/apache/lucene-solr/commit/fe17c813f5fe6773c0527f639b9e5c598b98c7d4#diff-081b7c2242d674bb175b41b6afc21663
> This issue is about adding tests and ensuring that online shard splitting (while updates are flowing) works reliably.


