You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@solr.apache.org by "pbackup12345 (via GitHub)" <gi...@apache.org> on 2023/04/08 15:36:00 UTC

[GitHub] [solr-operator] pbackup12345 opened a new issue, #547: backup to GCS and S3 never completes

pbackup12345 opened a new issue, #547:
URL: https://github.com/apache/solr-operator/issues/547

   I have a GKE autoPilot pod with a relatively vanilla setup with tlsTermination at Ingress and 3 pods. Everything works except for the backup feature.
   
   Following are the relevant parts of my setup:
   
   Main yaml on S3 location:
   ```
     backupRepositories:
       - name: "gcs-backups-1"
         gcs:
           bucket: "backupbx"
           gcsCredentialSecret:
             name: "gcssecret1"
             key: "service-account-key.json"
           baseLocation: "d"
       - name: "s3-backups-1"
         s3:
           region: "us-east-1" # Required
           bucket: "solr-xsearch" # Required
           credentials:
             accessKeyIdSecret: # Optional
               name: aws-secrets
               key: access-key-id
             secretAccessKeySecret: # Optional
               name: aws-secrets
               key: secret-access-key
   ```
   My backup yaml:
   ```
   apiVersion: solr.apache.org/v1beta1
   kind: SolrBackup
   metadata:
     name: local-backup14
     namespace: sop030
   spec:
     repositoryName: "s3-backups-1"
     solrCloud: explore
     collections:
       - dsearch
   ```
   The backup actually starts and both location (S3 and GCS) receives files as well, but after a while the backup process stops.
   There're no solr error messages, but this is the relevant portion of the logs of the pod which does the backup:
   ```
   2023-04-08 14:42:00.734 INFO  (parallelCoreAdminExecutor-19-thread-3-processing-n:explore-solrcloud-1.explore-solrcloud-headless.sop030:8983_solr x:dsearch_shard1_replica_n1 local-backup14-dsearch8519612781285 BACKUPCORE) [c:dsearch s:shard1 r:core_node2 x:dsearch_shard1_replica_n1] o.a.s.h.IncrementalShardBackup Done creating backup snapshot at s3:/local-backup14-dsearch/dsearch/ shardBackupMetadataFile:org.apache.solr.core.backup.ShardBackupId@46768e0a
   2023-04-08 14:42:01.396 INFO  (qtp1306246648-19) [   ] o.a.s.h.a.CoreAdminOperation Checking request status for : local-backup14-dsearch8519612781285
   2023-04-08 14:42:01.396 INFO  (qtp1306246648-19) [   ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/cores params={qt=/admin/cores&requestid=local-backup14-dsearch8519612781285&action=REQUESTSTATUS&wt=javabin&version=2} status=0 QTime=0
   2023-04-08 14:42:01.398 INFO  (OverseerThreadFactory-29-thread-5-processing-n:explore-solrcloud-1.explore-solrcloud-headless.sop030:8983_solr) [c:dsearch   ] o.a.s.c.a.c.BackupCmd Starting to backup ZK data for backupName=local-backup14-dsearch
   2023-04-08 14:42:04.358 INFO  (qtp1306246648-18) [   ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system params={} status=0 QTime=4
   2023-04-08 14:42:04.358 INFO  (qtp1306246648-21) [   ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system params={} status=0 QTime=4
   2023-04-08 14:42:09.357 INFO  (qtp1306246648-20) [   ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system params={} status=0 QTime=3
   2023-04-08 14:42:14.358 INFO  (qtp1306246648-18) [   ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system params={} status=0 QTime=3
   2023-04-08 14:42:14.361 INFO  (qtp1306246648-21) [   ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system params={} status=0 QTime=5
   2023-04-08 14:42:16.585 INFO  (OverseerThreadFactory-29-thread-5-processing-n:explore-solrcloud-1.explore-solrcloud-headless.sop030:8983_solr) [c:dsearch   ] o.a.s.c.a.c.BackupCmd Completed backing up ZK data for backupName=local-backup14-dsearch
   2023-04-08 14:42:16.655 INFO  (OverseerThreadFactory-29-thread-5-processing-n:explore-solrcloud-1.explore-solrcloud-headless.sop030:8983_solr) [c:dsearch   ] o.a.s.c.OverseerTaskQueue Response ZK path: /overseer/collection-queue-work/qnr-0000000043 doesn't exist. Requestor may have disconnected from ZooKeeper
   2023-04-08 14:42:18.400 INFO  (OverseerCollectionConfigSetProcessor-144115198735482887-explore-solrcloud-1.explore-solrcloud-headless.sop030:8983_solr-n_0000000002) [   ] o.a.s.c.OverseerTaskQueue Response ZK path: /overseer/collection-queue-work/qnr-0000000043 doesn't exist. Requestor may have disconnected from ZooKeeper
   2023-04-08 14:42:19.358 INFO  (qtp1306246648-20) [   ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system params={} status=0 QTime=3
   2023-04-08 14:42:24.359 INFO  (qtp1306246648-18) [   ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system params={} status=0 QTime=4
   2023-04-08 14:42:24.360 INFO  (qtp1306246648-22) [   ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system params={} status=0 QTime=3
   2023-04-08 14:42:29.359 INFO  (qtp1306246648-20) [   ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system params={} status=0 QTime=4
   2023-04-08 14:42:34.360 INFO  (qtp1306246648-18) [   ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system params={} status=0 QTime=6
   2023-04-08 14:42:34.360 INFO  (qtp1306246648-24) [   ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system params={} status=0 QTime=6
   2023-04-08 14:42:39.358 INFO  (qtp1306246648-21) [   ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/info/system params={} status=0 QTime=4
   2023-04-08 14:42:42.779 INFO  (qtp1306246648-18) [   ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/collections params={requestid=local-backup14-dsearch&action=REQUESTSTATUS&wt=json} status=0 QTime=3
   2023-04-08 14:42:42.797 INFO  (qtp1306246648-22) [   ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/collections params={requestid=local-backup14-dsearch&action=REQUESTSTATUS&wt=json} status=0 QTime=2
   ```
   
   And this is where it seems to die:
   ```
   2023-04-08 14:42:18.400 INFO  (OverseerCollectionConfigSetProcessor-144115198735482887-explore-solrcloud-1.explore-solrcloud-headless.sop030:8983_solr-n_0000000002) [   ] o.a.s.c.OverseerTaskQueue Response ZK path: /overseer/collection-queue-work/qnr-0000000043 doesn't exist. Requestor may have disconnected from ZooKeeper
   ```
   From there the log only has endless calls to check the requeststatus with no results:
   ```
   2023-04-08 14:43:27.884 INFO  (qtp1306246648-23) [   ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/collections params={requestid=local-backup14-dsearch&action=REQUESTSTATUS&wt=json} status=0 QTime=2
   ```
   Additionally through normal api calls the backup and restore function works perfectly. Specifically the following runs without a hitch:
   ```
   https://<solr-server>/solr/admin/collections?action=BACKUP&name=local-backup15&collection=dsearch&repository=s3-backups-1&location=/
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr-operator] cesarfm commented on issue #547: backup to GCS and S3 never completes

Posted by "cesarfm (via GitHub)" <gi...@apache.org>.
cesarfm commented on issue #547:
URL: https://github.com/apache/solr-operator/issues/547#issuecomment-1543802548

   Hello all. I am observing exactly the same symptoms described above.
   
   Moreover, I see something that might be of interest for the investigation. 
   - Let's say I have a GCS bucket called `mybucket`. After trying to run a backup, the files are created in the bucket as described by the OP. 
   - my SolrCloud is called `mycloud-solr`
   - my SolrBackup resource is called `mycloud-backup`, and I configured it with `location: mycloud-solr` (the location matches with the SolrCloud name, this is just a preference). So it's defined like:
   ```yaml
   apiVersion: solr.apache.org/v1beta1
   kind: SolrBackup
   metadata:
     name: mycloud-backup
     namespace: search
   spec:
     repositoryName: "gcs-backups"
     solrCloud: mycloud-solr
     location: "mycloud-solr"
     collections:
       - articles
   ```
   - the collection being backed up is called `articles`
   - then, I see that the actual backup files have the path `mycloud-solr/mycloud-backup-articles/articles` (so the path seems to be `{location}/{solrBackupName}-{collection}/{collection}`
   - Here is the thing: if I then connect to Solr and request:
   `http://localhost:8983/solr/admin/collections?action=LISTBACKUP&name=mycloud-backup-articles&location=mycloud-solr&repository=gcs-backups&gcsBucket=mybucket`, I do get the backups I created listed there, even if the SolrBackup CRD still reports itself as "in progress".
     - note that the `name=mycloud-backup-articles` corresponds to the path name directly under the location
   ```
   {
       "responseHeader": {
           "status": 0,
           "QTime": 347
       },
       "collection": "articles",
       "backups": [
           {
               "indexFileCount": 122,
               "indexSizeMB": 0.748,
               "shardBackupIds": {
                   "shard1": "md_shard1_0.json"
               },
               "collection.configName": "articles",
               "backupId": 0,
               "collectionAlias": "articles",
               "startTime": "2023-05-11T10:22:25.339652045Z",
               "indexVersion": "9.4.2",
               "endTime": "2023-05-11T10:22:42.946120826Z"
           },
           {
               "indexFileCount": 122,
               "indexSizeMB": 0.748,
               "shardBackupIds": {
                   "shard1": "md_shard1_1.json"
               },
               "collection.configName": "articles",
               "backupId": 1,
               "collectionAlias": "articles",
               "startTime": "2023-05-11T10:35:46.024110473Z",
               "indexVersion": "9.4.2",
               "endTime": "2023-05-11T10:35:50.499108324Z"
           }
       ]
   }
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr-operator] HoustonPutman commented on issue #547: backup to GCS and S3 never completes

Posted by "HoustonPutman (via GitHub)" <gi...@apache.org>.
HoustonPutman commented on issue #547:
URL: https://github.com/apache/solr-operator/issues/547#issuecomment-1507110018

   Thanks for providing so much information!
   
   Does this happen after a few backups complete, or does it happen from the first backup?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr-operator] pbackup12345 commented on issue #547: backup to GCS and S3 never completes

Posted by "pbackup12345 (via GitHub)" <gi...@apache.org>.
pbackup12345 commented on issue #547:
URL: https://github.com/apache/solr-operator/issues/547#issuecomment-1508050565

   It happens after first installation, when I created a collection and added a very few records. And it happens repeatedly. I.e. I started from scratch several times and bumped into the same problem with a default autoPilot GKE cluster.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org


[GitHub] [solr-operator] HoustonPutman commented on issue #547: backup to GCS and S3 never completes

Posted by "HoustonPutman (via GitHub)" <gi...@apache.org>.
HoustonPutman commented on issue #547:
URL: https://github.com/apache/solr-operator/issues/547#issuecomment-1509349115

   So at this line, it should have finished almost all of the backup logic and the files should exist in s3/gcs. The line after that will delete any old incremental backups if necessary (which it shouldn't be, as its the first backup...).
   
   ```
   2023-04-08 14:42:16.585 INFO  (OverseerThreadFactory-29-thread-5-processing-n:explore-solrcloud-1.explore-solrcloud-headless.sop030:8983_solr) [c:dsearch   ] o.a.s.c.a.c.BackupCmd Completed backing up ZK data for backupName=local-backup14-dsearch
   ```
   
   Also before that error can happen, it should save the state of the completed task. So whenever the requeststatus request is sent, it should have information to populate the result.
   
   > From there the log only has endless calls to check the requeststatus with no results:
   
   Can you show me what this means? What does the "no results" json look like?
   
   Have you tried running an async backup yourself?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@solr.apache.org
For additional commands, e-mail: issues-help@solr.apache.org