Posted to solr-user@lucene.apache.org by Damien Kamerman <da...@gmail.com> on 2017/06/26 05:28:50 UTC
async backup
I've noticed an issue with the Solr 6.5.1 Collections API BACKUP async
command returning early. The reported state is finished well before one of
the shards has finished.
The collection I'm backing up has 12 shards across 6 nodes, and I suspect
the issue is that it is not waiting for all backups on the node to finish.
Alternatively, if I change the request to not be async it works OK, but
sometimes I get the exception "backup the collection time out:180s".
Has anyone seen this, or does anyone know a workaround?
Cheers,
Damien.
Re: async backup
Posted by Damien Kamerman <da...@gmail.com>.
Yes. REQUESTSTATUS is returning state=completed prematurely.
On Tuesday, 27 June 2017, Amrit Sarkar <sa...@gmail.com> wrote:
> Does that give you "state"="completed"?
Re: async backup
Posted by Amrit Sarkar <sa...@gmail.com>.
Damien,
> then I poll with REQUESTSTATUS
REQUESTSTATUS is an API that gives you the status of any API call
(including other heavy-duty APIs such as SPLITSHARD or CREATECOLLECTION)
associated with the async_id at that moment. Does that give
you "state"="completed"?
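A minimal polling sketch, for illustration only, of checking the state repeatedly until it is terminal. It assumes the REQUESTSTATUS JSON response carries the state under `status.state` with values such as `submitted`, `running`, `completed`, `failed` and `notfound`; `fetch_status` is a hypothetical caller-supplied function (for example a thin wrapper around an HTTP GET), not part of Solr.

```python
import time

# States after which polling stops (assumed from the Collections API docs).
TERMINAL_STATES = {"completed", "failed", "notfound"}


def poll_request_status(fetch_status, interval_s=5.0, timeout_s=3600.0,
                        sleep=time.sleep):
    """Call fetch_status() until it reports a terminal state.

    fetch_status is a hypothetical callable returning the parsed JSON body
    of a REQUESTSTATUS call as a dict. Raises TimeoutError if no terminal
    state is seen within roughly timeout_s seconds of waiting.
    """
    waited = 0.0
    while True:
        state = fetch_status().get("status", {}).get("state")
        if state in TERMINAL_STATES:
            return state
        if waited >= timeout_s:
            raise TimeoutError(f"still {state!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Usage with canned responses instead of a live Solr node.
responses = iter([
    {"status": {"state": "submitted"}},
    {"status": {"state": "running"}},
    {"status": {"state": "completed"}},
])
final_state = poll_request_status(lambda: next(responses),
                                  sleep=lambda seconds: None)
print(final_state)
```

Note that a loop like this only reports what REQUESTSTATUS says; if the state is reported as completed prematurely, as described in this thread, the loop will return early too.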
Amrit Sarkar
Search Engineer
Lucidworks, Inc.
415-589-9269
www.lucidworks.com
Twitter http://twitter.com/lucidworks
LinkedIn: https://www.linkedin.com/in/sarkaramrit2
On Tue, Jun 27, 2017 at 5:25 AM, Damien Kamerman <da...@gmail.com> wrote:
> Has anyone seen this bug, or knows a workaround?
Re: async backup
Posted by Damien Kamerman <da...@gmail.com>.
A regular backup creates the files in this order:
drwxr-xr-x 2 root root 63 Jun 27 09:46 snapshot.shard7
drwxr-xr-x 2 root root 159 Jun 27 09:46 snapshot.shard8
drwxr-xr-x 2 root root 135 Jun 27 09:46 snapshot.shard1
drwxr-xr-x 2 root root 178 Jun 27 09:46 snapshot.shard3
drwxr-xr-x 2 root root 210 Jun 27 09:46 snapshot.shard11
drwxr-xr-x 2 root root 218 Jun 27 09:46 snapshot.shard9
drwxr-xr-x 2 root root 180 Jun 27 09:46 snapshot.shard2
drwxr-xr-x 2 root root 164 Jun 27 09:47 snapshot.shard5
drwxr-xr-x 2 root root 252 Jun 27 09:47 snapshot.shard6
drwxr-xr-x 2 root root 103 Jun 27 09:47 snapshot.shard12
drwxr-xr-x 2 root root 135 Jun 27 09:47 snapshot.shard4
drwxr-xr-x 2 root root 119 Jun 27 09:47 snapshot.shard10
drwxr-xr-x 3 root root 4 Jun 27 09:47 zk_backup
-rw-r--r-- 1 root root 185 Jun 27 09:47 backup.properties
While an async backup creates files in this order:
drwxr-xr-x 2 root root 15 Jun 27 09:49 snapshot.shard3
drwxr-xr-x 2 root root 15 Jun 27 09:49 snapshot.shard9
drwxr-xr-x 2 root root 62 Jun 27 09:49 snapshot.shard6
drwxr-xr-x 2 root root 37 Jun 27 09:49 snapshot.shard2
drwxr-xr-x 2 root root 67 Jun 27 09:49 snapshot.shard7
drwxr-xr-x 2 root root 75 Jun 27 09:49 snapshot.shard5
drwxr-xr-x 2 root root 70 Jun 27 09:49 snapshot.shard8
drwxr-xr-x 2 root root 15 Jun 27 09:49 snapshot.shard4
drwxr-xr-x 2 root root 15 Jun 27 09:50 snapshot.shard11
drwxr-xr-x 2 root root 127 Jun 27 09:50 snapshot.shard1
drwxr-xr-x 2 root root 116 Jun 27 09:50 snapshot.shard12
drwxr-xr-x 3 root root 4 Jun 27 09:50 zk_backup
-rw-r--r-- 1 root root 185 Jun 27 09:50 backup.properties
drwxr-xr-x 2 root root 25 Jun 27 09:51 snapshot.shard10
shard10 is much larger than the other shards.
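To make the ordering in the async listing explicit, here is a small sketch that flags entries whose modification time is later than that of backup.properties. It is only an illustration: it assumes `ls -l` style lines with nine whitespace-separated fields, all from the same day, and the function name `modified_after_properties` is made up for this example. Only the last four lines of the async listing are used.

```python
# Tail of the async listing quoted above.
ASYNC_LISTING = """\
drwxr-xr-x 2 root root 116 Jun 27 09:50 snapshot.shard12
drwxr-xr-x 3 root root 4 Jun 27 09:50 zk_backup
-rw-r--r-- 1 root root 185 Jun 27 09:50 backup.properties
drwxr-xr-x 2 root root 25 Jun 27 09:51 snapshot.shard10
"""


def modified_after_properties(listing):
    """Return entry names whose HH:MM mtime is later than backup.properties'.

    Assumes every line is from the same day, so the zero-padded HH:MM
    strings can be compared directly.
    """
    mtimes = {}
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) == 9:  # perms, links, owner, group, size, month, day, time, name
            mtimes[fields[8]] = fields[7]
    marker = mtimes["backup.properties"]
    return sorted(name for name, mtime in mtimes.items() if mtime > marker)


late_entries = modified_after_properties(ASYNC_LISTING)
print(late_entries)  # ['snapshot.shard10']
```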
From the logs:
INFO - 2017-06-27 09:50:33.832; [ ] org.apache.solr.cloud.BackupCmd; Completed backing up ZK data for backupName=collection1
INFO - 2017-06-27 09:50:33.800; [ ] org.apache.solr.handler.admin.CoreAdminOperation; Checking request status for : backup1103459705035055
INFO - 2017-06-27 09:50:33.800; [ ] org.apache.solr.servlet.HttpSolrCall; [admin] webapp=null path=/admin/cores params={qt=/admin/cores&requestid=backup1103459705035055&action=REQUESTSTATUS&wt=javabin&version=2} status=0 QTime=0
INFO - 2017-06-27 09:51:33.405; [ ] org.apache.solr.handler.SnapShooter; Done creating backup snapshot: shard10 at file:///online/backup/collection1
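For reference, the gap between the two log entries above (ZK backup completed versus the shard10 snapshot being done) can be computed from their timestamps. This is just arithmetic on the values in the log excerpt; the variable names are illustrative.

```python
from datetime import datetime

LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the BackupCmd and SnapShooter log lines.
zk_backup_done = datetime.strptime("2017-06-27 09:50:33.832", LOG_TIME_FORMAT)
shard10_done = datetime.strptime("2017-06-27 09:51:33.405", LOG_TIME_FORMAT)

gap_seconds = (shard10_done - zk_backup_done).total_seconds()
print(f"shard10 finished {gap_seconds:.3f}s after the ZK backup completed")
```

So the shard10 snapshot finished roughly a minute after the backup had already written its ZK data.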
Has anyone seen this bug, or does anyone know a workaround?
On 27 June 2017 at 09:47, Damien Kamerman <da...@gmail.com> wrote:
> Yes, the async command returns, and then I poll with REQUESTSTATUS.
Re: async backup
Posted by Damien Kamerman <da...@gmail.com>.
Yes, the async command returns, and then I poll with REQUESTSTATUS.
On 27 June 2017 at 01:24, Varun Thacker <va...@vthacker.in> wrote:
> Are you using the REQUESTSTATUS (
> http://lucene.apache.org/solr/guide/6_6/collections-api.html#collections-api
> ) API to validate if the backup is complete?
Re: async backup
Posted by Varun Thacker <va...@vthacker.in>.
Hi Damien,
A backup command with async is supposed to return early. It starts the
backup process and returns.
Are you using the REQUESTSTATUS API (
http://lucene.apache.org/solr/guide/6_6/collections-api.html#collections-api
) to check whether the backup is complete?
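As a rough sketch of the two calls involved, the snippet below builds an async BACKUP request URL and the matching REQUESTSTATUS URL. The parameter names (`action`, `name`, `collection`, `location`, `async`, `requestid`) are those of the Collections API; the host, port, request id, and backup location are hypothetical values chosen for illustration, and `collections_url` is a helper invented here.

```python
from urllib.parse import urlencode

# Hypothetical values for illustration only.
BASE_URL = "http://localhost:8983/solr"
REQUEST_ID = "backup-001"


def collections_url(base_url, params):
    """Build a Collections API URL from a dict of query parameters."""
    return base_url + "/admin/collections?" + urlencode(params)


# Start the backup asynchronously; Solr returns as soon as it is submitted.
backup_url = collections_url(BASE_URL, {
    "action": "BACKUP",
    "name": "collection1",
    "collection": "collection1",
    "location": "/online/backup",  # assumed location, not confirmed by the thread
    "async": REQUEST_ID,
    "wt": "json",
})

# Ask for the status of that request id later on.
status_url = collections_url(BASE_URL, {
    "action": "REQUESTSTATUS",
    "requestid": REQUEST_ID,
    "wt": "json",
})

print(backup_url)
print(status_url)
```

The same request id ties the two calls together: whatever is passed as `async` to BACKUP is what REQUESTSTATUS expects as `requestid`.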
On Sun, Jun 25, 2017 at 10:28 PM, Damien Kamerman <da...@gmail.com> wrote:
> I've noticed an issue with the Solr 6.5.1 Collections API BACKUP async
> command returning early.