Posted to dev@lucene.apache.org by "Rohit (JIRA)" <ji...@apache.org> on 2018/04/05 11:40:00 UTC

[jira] [Comment Edited] (SOLR-12065) Restore replica always in buffering state

    [ https://issues.apache.org/jira/browse/SOLR-12065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426778#comment-16426778 ] 

Rohit edited comment on SOLR-12065 at 4/5/18 11:39 AM:
-------------------------------------------------------

1. In AbstractCloudBackupRestoreTestCase we added a new method, indexNewDocsToCollection. Can't we reuse indexDocs? We can change that method so that it's generally more reusable for your test.
 - Done

2. Maybe {{getDocCountInCollection}} can be something like this, since we are counting the number of docs in a collection, so hitting any underlying node will be fine.
 - Reused the existing getShardToDocCountMap function

3. RestoreCmd has an unused import. This will make {{ant precommit}} fail.
 - Removed the unused import and verified with {{ant precommit}}:

 [exec] Crawl/parse...
 [exec] 
 [exec] Verify...
 [echo] Checking for malformed docs...

 precommit:

 BUILD SUCCESSFUL
 Total time: 16 minutes 30 seconds

4. Should both of the log.info lines added to RestoreCmd be debug level? Also, can we use parameterized logging?
 - Changed it in one place and used parameterized logging

5. After the REQUESTAPPLYUPDATES command is sent, shouldn't we validate the response? Something like this:
{code:java}
 ocmh.processResponses(new NamedList(), shardHandler, true, "REQUESTAPPLYUPDATES calls did not succeed", asyncId, requestMap);{code}
 - Done, added it to the RestoreCmd source.

Verified the build on master and checked TLOG.state for the test_restore collection using the admin/metrics API on a two-node Solr cluster:

{{"SEARCHER.searcher.searcherName":"Searcher@70ed14d[test_restore_shard1_replica_n47] main",}}
{{"SEARCHER.searcher.warmupTime":0,}}
{{......}}
{{"TLOG.replay.remaining.bytes":0,}}
{{"TLOG.replay.remaining.logs":0,}}
{{"TLOG.state":3,}}
{{"SEARCHER.searcher.registeredAt":"2018-04-05T11:24:20.667Z",}}
{{"SEARCHER.searcher.searcherName":"Searcher@475f6bda[test_restore_shard2_replica_n45] main",}}
{{"TLOG.replay.remaining.bytes":104,}}
{{"TLOG.replay.remaining.logs":1,}}
{{"TLOG.state":3,}}
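
To make the numeric {{TLOG.state}} values in the metrics output above easier to read, here is a minimal Java sketch that maps them to state names. It assumes the values follow the ordering of Solr's UpdateLog.State enum (REPLAYING=0, BUFFERING=1, APPLYING_BUFFERED=2, ACTIVE=3). Only 1 = BUFFERING is stated in this issue, so the other values should be checked against the Solr source. TlogState and fromMetric are hypothetical names for illustration, not part of Solr.

```java
// Hypothetical helper, not part of Solr.
// ASSUMPTION: ordinals mirror Solr's UpdateLog.State
// (REPLAYING=0, BUFFERING=1, APPLYING_BUFFERED=2, ACTIVE=3).
enum TlogState {
    REPLAYING, BUFFERING, APPLYING_BUFFERED, ACTIVE;

    // Translate the numeric "TLOG.state" metric into a named state.
    static TlogState fromMetric(int value) {
        TlogState[] all = values();
        if (value < 0 || value >= all.length) {
            throw new IllegalArgumentException("Unknown TLOG.state value: " + value);
        }
        return all[value];
    }
}

final class TlogStateDemo {
    public static void main(String[] args) {
        System.out.println(TlogState.fromMetric(3)); // prints ACTIVE
        System.out.println(TlogState.fromMetric(1)); // prints BUFFERING
    }
}
```

Under that assumption, the value 3 reported for both replicas above would correspond to ACTIVE rather than BUFFERING.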

 

[~varunthacker] Please take a look and let me know your feedback.



> Restore replica always in buffering state
> -----------------------------------------
>
>                 Key: SOLR-12065
>                 URL: https://issues.apache.org/jira/browse/SOLR-12065
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Varun Thacker
>            Assignee: Varun Thacker
>            Priority: Major
>         Attachments: 12065.patch, 12605UTLogs.txt.zip, SOLR-12065.patch, logs_and_metrics.zip, restore_snippet.log
>
>
> Steps to reproduce:
>  
>  - [http://localhost:8983/solr/admin/collections?action=CREATE&name=test_backup&numShards=1&nrtReplicas=1]
>  - curl [http://127.0.0.1:8983/solr/test_backup/update?commit=true] -H 'Content-type:application/json' -d '[{"id" : "1"}]'
>  - [http://localhost:8983/solr/admin/collections?action=BACKUP&name=test_backup&collection=test_backup&location=/Users/varunthacker/backups]
>  - [http://localhost:8983/solr/admin/collections?action=RESTORE&name=test_backup&location=/Users/varunthacker/backups&collection=test_restore]
>  - curl [http://127.0.0.1:8983/solr/test_restore/update?commit=true] -H 'Content-type:application/json' -d '[{"id" : "2"}]'
>  - Log snippet when you try adding a document:
> {code:java}
> INFO - 2018-03-07 22:48:11.555; [c:test_restore s:shard1 r:core_node22 x:test_restore_shard1_replica_n21] org.apache.solr.update.processor.DistributedUpdateProcessor; Ignoring commit while not ACTIVE - state: BUFFERING replay: false
> INFO - 2018-03-07 22:48:11.556; [c:test_restore s:shard1 r:core_node22 x:test_restore_shard1_replica_n21] org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor; [test_restore_shard1_replica_n21] webapp=/solr path=/update params={commit=true}{add=[2 (1594320896973078528)],commit=} 0 4{code}
>  - If you check "TLOG.state" from [http://localhost:8983/solr/admin/metrics], it's always 1 (BUFFERING).
>  
>  
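
The Collections API calls in the steps above can also be assembled programmatically. The following Java sketch only builds the request URLs and does not send them. The class and method names are hypothetical; the host, port, and parameter names are taken from the steps. Parameter values are URL-encoded, so a location such as /Users/varunthacker/backups appears with %2F in place of each slash. It assumes Java 10 or later for URLEncoder.encode(String, Charset).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper that builds the Collections API URLs used in the reproduction steps.
final class CollectionsApiUrls {
    // Host and port as used in the issue's steps.
    static final String BASE = "http://localhost:8983/solr/admin/collections";

    // Join the action and the URL-encoded parameters into one request URL.
    static String url(String action, Map<String, String> params) {
        String query = params.entrySet().stream()
            .map(e -> e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
            .collect(Collectors.joining("&"));
        return BASE + "?action=" + action + (query.isEmpty() ? "" : "&" + query);
    }

    static String backup(String backupName, String collection, String location) {
        Map<String, String> p = new LinkedHashMap<>(); // keeps parameter order stable
        p.put("name", backupName);
        p.put("collection", collection);
        p.put("location", location);
        return url("BACKUP", p);
    }

    static String restore(String backupName, String location, String targetCollection) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("name", backupName);
        p.put("location", location);
        p.put("collection", targetCollection);
        return url("RESTORE", p);
    }

    public static void main(String[] args) {
        String location = "/Users/varunthacker/backups"; // path from the issue; adjust locally
        System.out.println(backup("test_backup", "test_backup", location));
        System.out.println(restore("test_backup", location, "test_restore"));
    }
}
```

After issuing the RESTORE request, the "TLOG.state" metric for the restored collection is the value to watch.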


