Posted to issues@solr.apache.org by "Jan Høydahl (Jira)" <ji...@apache.org> on 2021/08/19 14:34:00 UTC

[jira] [Commented] (SOLR-14845) Backup failing with solr 7.7.2 java.io.IOException: Interrupted system call

    [ https://issues.apache.org/jira/browse/SOLR-14845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17401710#comment-17401710 ] 

Jan Høydahl commented on SOLR-14845:
------------------------------------

Jeff, do you still have this problem? Does it happen every time, i.e. is it reproducible? Does it still occur on a Solr 8.x cluster? If you are able to reproduce it, please also copy the relevant ERROR sections of the solr.log file on the server, which may shed more light on what happened.

Most likely, there were some low-level I/O issues between your server and the disk system while writing the backup for one of the cores, which would of course be more likely the longer the backup job runs. As such it is not a Solr bug, although you could wish for more retry logic or something similar.
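
For illustration only, retrying on the client side could be sketched roughly as below. This is an assumption about how one might wrap the backup call, not something Solr provides: `retry` is a hypothetical helper, and the command passed to it (here a hypothetical `run_backup` function) would have to return non-zero itself when the async task reports "failed", since curl alone exits 0 in that case.

```shell
#!/usr/bin/env bash
# Hypothetical client-side retry helper; NOT part of Solr.
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
retry() {
  local max_attempts
  local delay
  local attempt
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while true; do
    # Run the wrapped command; stop as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    # Give up once the attempt budget is used.
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after ${attempt} attempts" >&2
      return 1
    fi
    echo "attempt ${attempt} failed, retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}
```

It would be called as `retry 3 60 run_backup`, where `run_backup` issues the backup request and checks the task status. Each attempt would probably need its own backup name or a cleaned-up target directory, which this sketch does not handle.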

I'll close this as not reproducible if there are no objections.

> Backup failing with solr 7.7.2 java.io.IOException: Interrupted system call
> ---------------------------------------------------------------------------
>
>                 Key: SOLR-14845
>                 URL: https://issues.apache.org/jira/browse/SOLR-14845
>             Project: Solr
>          Issue Type: Bug
>          Components: Backup/Restore
>    Affects Versions: 7.2.2
>            Reporter: Jeff
>            Priority: Critical
>
> I have a 12 node solrcloud cluster with 48 shards.
> 800GB on each node.
> 7.3 million docs and around 98 GB per shard.
>  
>  
> When I issue the backup command it runs for several hours and produces most of the backup but fails on some shards.
>  
> Command issued
> curl -XPOST 'http://xx.xxx.xxx.xxx:8983/solr/admin/collections?action=BACKUP&name=prod1&collection=PROD&location=/mnt/prodstorage/backup&async=111113&wt=xml'
>  
>     "Response":"TaskId: 1111127375391376590965 webapp=null path=/admin/cores params={core=PROD_shard8_1_replica_n156&async=1111127375391376590965&qt=/admin/cores&name=shard8_1&action=BACKUPCORE&location=file:///mnt/prodstorage/backup/prod1&wt=javabin&version=2} status=0 QTime=0"},
>   "1111127375391376904569":{
>     "responseHeader":{
>       "status":0,
>       "QTime":0},
>     "STATUS":"failed",
>     "Response":"Failed to backup core=PROD_shard19_1_replica_n263 because java.io.IOException: Interrupted system call"},
>   "status":{
>     "state":"failed",    "msg":"found [111112] in failed tasks"}}
>  
>  
> Can I lengthen the timeout? Or back up manually?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
