Posted to solr-user@lucene.apache.org by Brett Hoerner <br...@bretthoerner.com> on 2012/12/05 02:19:50 UTC

SolrCloud stops handling collection CREATE/DELETE (but responds HTTP 200)

Hi,

I have a Cloud setup of 4 machines. I bootstrapped them with 1 collection,
which I called "default" and haven't used since. I'm using an external ZK
ensemble that was completely empty before I started this cloud.

Once I had all 4 nodes in the cloud I used the collection API to create the
real collections I wanted. I also tested that deleting works.

For example,

# this worked
curl "http://localhost:8984/solr/admin/collections?action=CREATE&name=15678&numShards=4"

# this worked
curl "http://localhost:8984/solr/admin/collections?action=DELETE&name=15678"

Next, I started my indexer service which happily sent many, many updates to
the cloud. Queries against the collections also work just fine.

Finally, a few hours later, I tried doing a create and a delete. Both
operations did nothing, although Solr replied with a "200 OK".

$ curl -i "http://localhost:8984/solr/admin/collections?action=CREATE&name=15679&numShards=4"
HTTP/1.1 200 OK
Content-Type: application/xml; charset=UTF-8
Transfer-Encoding: chunked

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">3</int></lst>

There is nothing in the stdout/stderr logs, nor in the Java logs (the log
level is set to WARN).

I have tried bouncing the nodes and it doesn't change anything.

Any ideas? How can I further debug this or what else can I provide?

Re: SolrCloud stops handling collection CREATE/DELETE (but responds HTTP 200)

Posted by Mark Miller <ma...@gmail.com>.
Yeah, it is a known bug. It was fixed a while ago on 4x and will be in 4.1.

An exception would kill the collection manager's wait loop, so nothing
queued behind the failing operation got processed.
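
The failure mode described here can be sketched generically. This is not
Solr's overseer code; it is a hypothetical Python illustration
(process_queue, handle, and the operation dicts are made-up names) of why an
uncaught exception in a queue-draining loop strands every later item, and
how handling the error per item avoids that.

```python
from collections import deque


def handle(op, collections):
    """Apply one queued operation to an in-memory set of collection names."""
    if op["action"] == "CREATE":
        collections.add(op["name"])
    elif op["action"] == "DELETE":
        # set.remove raises KeyError when the collection does not exist.
        collections.remove(op["name"])


def process_queue(queue, collections, tolerate_errors):
    """Drain the queue in order and return the operations that failed."""
    failed = []
    while queue:
        op = queue[0]
        try:
            handle(op, collections)
        except KeyError:
            if not tolerate_errors:
                # The loop dies here; the bad item stays at the head of the
                # queue and everything behind it is never processed.
                raise
            failed.append(op)
        queue.popleft()
    return failed
```

With tolerate_errors=False, a DELETE for a missing collection followed by a
CREATE leaves both items in the queue. With tolerate_errors=True, the DELETE
is reported as failed and the CREATE still runs.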

- Mark

On Sun, Dec 9, 2012 at 9:21 PM, Brett Hoerner <br...@bretthoerner.com> wrote:
> Thanks,
>
> It looks like my cluster is in a wedged state after I tried to delete a
> collection that didn't exist. There are about 80 items in the queue after
> the delete op (that it can't get by). Is that a known bug?
>
> I guess for now I'll just check that a collection exists before sending any
> deletes. :)
>
> Brett

Re: SolrCloud stops handling collection CREATE/DELETE (but responds HTTP 200)

Posted by Brett Hoerner <br...@bretthoerner.com>.
Thanks,

It looks like my cluster is in a wedged state after I tried to delete a
collection that didn't exist. There are about 80 items in the queue after
the delete op (that it can't get by). Is that a known bug?

I guess for now I'll just check that a collection exists before sending any
deletes. :)
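
That workaround could look roughly like the sketch below. It assumes the
cluster state has already been fetched as JSON (for example from the
/clusterstate.json node in ZooKeeper) and that its top-level keys are
collection names; collection_exists, delete_url_if_exists, and the sample
state in the usage note are hypothetical.

```python
import json
from urllib.parse import urlencode

# Base URL taken from the curl commands in the original post.
BASE = "http://localhost:8984/solr/admin/collections"


def collection_exists(cluster_state_json, name):
    """Assumes the top-level keys of the cluster state are collection names."""
    return name in json.loads(cluster_state_json)


def delete_url_if_exists(cluster_state_json, name):
    """Return the DELETE URL, or None when the collection is not known."""
    if not collection_exists(cluster_state_json, name):
        return None
    return BASE + "?" + urlencode({"action": "DELETE", "name": name})
```

For a state of '{"default": {}, "15678": {}}', asking for "15678" yields the
DELETE URL and asking for any other name yields None. The check and the
delete are not atomic, so this only narrows the window; it does not close it.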

Brett


On Fri, Dec 7, 2012 at 10:50 AM, Mark Miller <ma...@gmail.com> wrote:

> Anything in any of the other logs (the other nodes)? The key is getting
> the logs from the node designated as the overseer - it should hopefully
> have the error.
>
> Right now because you pass this stuff off to the overseer, you will always
> get back a 200 - there is a JIRA issue that addresses this though
> (collection API responses) and I hope to get it committed soon.
>
> - Mark

Re: SolrCloud stops handling collection CREATE/DELETE (but responds HTTP 200)

Posted by Mark Miller <ma...@gmail.com>.
Anything in any of the other logs (the other nodes)? The key is getting the logs from the node designated as the overseer - it should hopefully have the error.

Right now, because you pass this stuff off to the overseer, you will always get back a 200. There is a JIRA issue that addresses this (collection API responses), and I hope to get it committed soon.
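
To make that concrete: a status of 0 in the responseHeader only says the
request was accepted, not that the collection was created. A minimal Python
sketch, assuming a complete response body shaped like the truncated one in
the original post; SAMPLE and parse_response_header are illustrative names,
not part of Solr.

```python
import xml.etree.ElementTree as ET

# Hypothetical complete body; the response quoted in this thread is truncated.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">3</int></lst>
</response>"""


def parse_response_header(body):
    """Return the <int> fields of the responseHeader as a dict of ints."""
    root = ET.fromstring(body.encode("utf-8"))
    header = root.find("lst[@name='responseHeader']")
    return {child.get("name"): int(child.text) for child in header.findall("int")}
```

Here parse_response_header(SAMPLE) returns {"status": 0, "QTime": 3}. Given
the explanation above, a caller that needs to know whether the operation
took effect has to verify it separately.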

- Mark

On Dec 7, 2012, at 7:26 AM, Brett Hoerner <br...@bretthoerner.com> wrote:

> For what it's worth this is the log output with DEBUG on,
> 
> Dec 07, 2012 2:00:48 PM org.apache.solr.handler.admin.CollectionsHandler handleCreateAction
> INFO: Creating Collection : action=CREATE&name=foo&numShards=4
> Dec 07, 2012 2:01:03 PM org.apache.solr.core.SolrCore execute
> INFO: [15671] webapp=/solr path=/admin/system params={wt=json} status=0 QTime=5
> Dec 07, 2012 2:01:15 PM org.apache.solr.handler.admin.CollectionsHandler handleDeleteAction
> INFO: Deleting Collection : action=DELETE&name=default
> Dec 07, 2012 2:01:20 PM org.apache.solr.core.SolrCore execute
> 
> Neither the CREATE or DELETE actually did anything, though. (Again, HTTP
> 200 OK)
> 
> Still stuck here, any ideas?
> 
> Brett


Re: SolrCloud stops handling collection CREATE/DELETE (but responds HTTP 200)

Posted by Brett Hoerner <br...@bretthoerner.com>.
For what it's worth this is the log output with DEBUG on,

Dec 07, 2012 2:00:48 PM org.apache.solr.handler.admin.CollectionsHandler handleCreateAction
INFO: Creating Collection : action=CREATE&name=foo&numShards=4
Dec 07, 2012 2:01:03 PM org.apache.solr.core.SolrCore execute
INFO: [15671] webapp=/solr path=/admin/system params={wt=json} status=0 QTime=5
Dec 07, 2012 2:01:15 PM org.apache.solr.handler.admin.CollectionsHandler handleDeleteAction
INFO: Deleting Collection : action=DELETE&name=default
Dec 07, 2012 2:01:20 PM org.apache.solr.core.SolrCore execute

Neither the CREATE nor the DELETE actually did anything, though. (Again, HTTP
200 OK.)
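
The log lines above show only that CollectionsHandler received each request.
A small Python sketch for pulling those requests out of such a log, useful
for comparing what was asked for against what exists; the regular expression
is fitted to the two INFO messages quoted here, and collection_requests is a
hypothetical helper name.

```python
import re
from urllib.parse import parse_qs

# Matches the "Creating Collection" / "Deleting Collection" INFO lines above.
LINE = re.compile(r"INFO: (?:Creating|Deleting) Collection : (\S+)")


def collection_requests(log_text):
    """Return (action, name) pairs for each collection request in the log."""
    requests = []
    for match in LINE.finditer(log_text):
        params = parse_qs(match.group(1))
        requests.append((params["action"][0], params["name"][0]))
    return requests
```

Run over the log excerpt above, this returns
[("CREATE", "foo"), ("DELETE", "default")].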

Still stuck here, any ideas?

Brett

