Posted to user@couchdb.apache.org by Simon Eisenmann <si...@struktur.de> on 2009/11/04 12:15:10 UTC

Continuous Replication hangs with "start"

Hi,

I have been running continuous replication for the past three weeks on
three nodes, each replicating to the others, with continuous changes on
every node. All was running fine until one node crashed because of a
full disk.

I freed up disk space and restarted this node. Replication never
seems to start again. The status in Futon keeps saying "Starting".

I tried to restart all the nodes with no luck, all having the same
issues.

See the errors in the couchdb.log below.

I am running CouchDB 0.10.0 on Erlang R13B.

I hope somebody has some sort of idea what is going on.


Thanks
Simon


[Wed, 04 Nov 2009 11:10:19 GMT] [error] [<0.2114.0>] ** Generic server
<0.2114.0> terminating 
** Last message in was {tcp,#Port<0.2219>,
                            <<"HTTP/1.1 500 Internal Server Error\r
\nServer: CouchDB/0.10.0 (Erlang OTP/R13B)\r\nDate: Wed, 04 Nov 2009
11:10:19 GMT\r\nContent-Type: application/json\r\nContent-Length: 114\r
\nCache-Control: must-revalidate\r\n\r\n{\"error\":\"json_encode\",
\"reason\":
\"{bad_term,{26621,<<47,146,160,252,77,28,71,180,40,233,173,84,97,192,173,47>>}}\"}\n">>}
** When Server state ==
{state,"10.1.1.31",5984,false,undefined,[],false,
                            #Port<0.2219>,
                            {[{request,
                                  {url,

"http://10.1.1.31:5984/gangstercluster_1/ping_http:%2F%
2Fredemption.intranet.struktur.de:2425?open_revs=[\"26621-2f92a0fc4d1c47b428e9ad5461c0ad2f\"]&revs=true&latest=true",

"10.1.1.31",5984,undefined,undefined,
                                      "/gangstercluster_1/ping_http:%2F%
2Fredemption.intranet.struktur.de:2425?open_revs=[\"26621-2f92a0fc4d1c47b428e9ad5461c0ad2f\"]&revs=true&latest=true",
                                      http},
                                  get,
                                  [{response_format,binary},
                                   {inactivity_timeout,30000}],
                                  {<0.2084.0>,#Ref<0.0.3.30137>},
                                  undefined,false,
                                  {1257,333019,329755},

1048576,false,undefined,undefined,binary}],
                             []},
                            {request,
                                {url,

"http://10.1.1.31:5984/gangstercluster_1/ping_http:%2F%
2Fredemption.intranet.struktur.de:2425?open_revs=[\"26621-2f92a0fc4d1c47b428e9ad5461c0ad2f\"]&revs=true&latest=true",

"10.1.1.31",5984,undefined,undefined,
                                    "/gangstercluster_1/ping_http:%2F%
2Fredemption.intranet.struktur.de:2425?open_revs=[\"26621-2f92a0fc4d1c47b428e9ad5461c0ad2f\"]&revs=true&latest=true",
                                    http},
                                get,
                                [{response_format,binary},
                                 {inactivity_timeout,30000}],
                                {<0.2084.0>,#Ref<0.0.3.30137>},
                                undefined,false,
                                {1257,333019,329755},

1048576,false,undefined,undefined,binary},
                            get_body,"200",<<"[">>,0,0,
                            [{"Transfer-Encoding","chunked"},
                             {"Server","CouchDB/0.10.0 (Erlang
OTP/R13B)"},
                             {"Date","Wed, 04 Nov 2009 11:10:19 GMT"},
                             {"Content-Type","application/json"},
                             {"Cache-Control","must-revalidate"}],
                            false,undefined,undefined,true,chunked,
                            chunk_start,<<>>,0,262182,1,undefined}
** Reason for termination == 
** {function_clause,[{ibrowse_http_client,to_ascii,"r"},
                     {ibrowse_http_client,hexlist_to_integer,3},
                     {ibrowse_http_client,parse_11_response,2},
                     {ibrowse_http_client,handle_sock_data,2},
                     {gen_server,handle_msg,5},
                     {proc_lib,init_p_do_apply,3}]}


[Wed, 04 Nov 2009 11:10:19 GMT] [error] [<0.2114.0>]
{error_report,<0.24.0>,
    {<0.2114.0>,crash_report,
     [[{initial_call,{ibrowse_http_client,init,['Argument__1']}},
       {pid,<0.2114.0>},
       {registered_name,[]},
       {error_info,
           {exit,
               {function_clause,
                   [{ibrowse_http_client,to_ascii,"r"},
                    {ibrowse_http_client,hexlist_to_integer,3},
                    {ibrowse_http_client,parse_11_response,2},
                    {ibrowse_http_client,handle_sock_data,2},
                    {gen_server,handle_msg,5},
                    {proc_lib,init_p_do_apply,3}]},

[{gen_server,terminate,6},{proc_lib,init_p_do_apply,3}]}},
       {ancestors,[<0.139.0>,ibrowse,<0.1.0>]},
       {messages,[]},
       {links,[<0.139.0>]},
       {dictionary,
           [{my_trace_flag,false},
            {ibrowse_trace_token,["10.1.1.31",58,"5984"]},
            {http_prot_vsn,"HTTP/1.1"},
            {conn_close,"false"}]},
       {trap_exit,false},
       {status,running},
       {heap_size,4181},
       {stack_size,24},
       {reductions,1570}],
      []]}}


[Wed, 04 Nov 2009 11:12:25 GMT] [error] [<0.2339.0>] ** Generic server
<0.2339.0> terminating 
** Last message in was {tcp,#Port<0.2310>,
                            <<"HTTP/1.1 500 Internal Server Error\r
\nServer: CouchDB/0.10.0 (Erlang OTP/R13B)\r\nDate: Wed, 04 Nov 2009
11:12:25 GMT\r\nContent-Type: application/json\r\nContent-Length: 114\r
\nCache-Control: must-revalidate\r\n\r\n{\"error\":\"json_encode\",
\"reason\":
\"{bad_term,{26621,<<47,146,160,252,77,28,71,180,40,233,173,84,97,192,173,47>>}}\"}\n">>}
** When Server state ==
{state,"10.1.1.31",5984,false,undefined,[],false,
                            #Port<0.2310>,
                            {[{request,
                                  {url,

"http://10.1.1.31:5984/gangstercluster_1/ping_http:%2F%
2Fredemption.intranet.struktur.de:2425?open_revs=[\"26621-2f92a0fc4d1c47b428e9ad5461c0ad2f\"]&revs=true&latest=true",

"10.1.1.31",5984,undefined,undefined,
                                      "/gangstercluster_1/ping_http:%2F%
2Fredemption.intranet.struktur.de:2425?open_revs=[\"26621-2f92a0fc4d1c47b428e9ad5461c0ad2f\"]&revs=true&latest=true",
                                      http},
                                  get,
                                  [{response_format,binary},
                                   {inactivity_timeout,30000}],
                                  {<0.2084.0>,#Ref<0.0.3.120029>},
                                  undefined,false,
                                  {1257,333145,499104},

1048576,false,undefined,undefined,binary}],
                             []},
                            {request,
                                {url,

"http://10.1.1.31:5984/gangstercluster_1/ping_http:%2F%
2Fredemption.intranet.struktur.de:2425?open_revs=[\"26621-2f92a0fc4d1c47b428e9ad5461c0ad2f\"]&revs=true&latest=true",

"10.1.1.31",5984,undefined,undefined,
                                    "/gangstercluster_1/ping_http:%2F%
2Fredemption.intranet.struktur.de:2425?open_revs=[\"26621-2f92a0fc4d1c47b428e9ad5461c0ad2f\"]&revs=true&latest=true",
                                    http},
                                get,
                                [{response_format,binary},
                                 {inactivity_timeout,30000}],
                                {<0.2084.0>,#Ref<0.0.3.120029>},
                                undefined,false,
                                {1257,333145,499104},

1048576,false,undefined,undefined,binary},
                            get_body,"200",<<"[">>,0,0,
                            [{"Transfer-Encoding","chunked"},
                             {"Server","CouchDB/0.10.0 (Erlang
OTP/R13B)"},
                             {"Date","Wed, 04 Nov 2009 11:12:25 GMT"},
                             {"Content-Type","application/json"},
                             {"Cache-Control","must-revalidate"}],
                            false,undefined,undefined,true,chunked,
                            chunk_start,<<>>,0,311334,1,undefined}
** Reason for termination == 
** {function_clause,[{ibrowse_http_client,to_ascii,"r"},
                     {ibrowse_http_client,hexlist_to_integer,3},
                     {ibrowse_http_client,parse_11_response,2},
                     {ibrowse_http_client,handle_sock_data,2},
                     {gen_server,handle_msg,5},
                     {proc_lib,init_p_do_apply,3}]}
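
A reading aid for the log above: the 16-byte binary inside the bad_term
tuple looks like the hash part of the revision ID in the request URL. The
Python sketch below checks that by hex-encoding the bytes. The helper name
rev_from_term is made up for illustration, and the "position-hexhash"
layout is taken from the revision string visible in the URL, not from any
CouchDB source.

```python
# Bytes copied from the bad_term tuple in the log above.
bad_term_bytes = [47, 146, 160, 252, 77, 28, 71, 180,
                  40, 233, 173, 84, 97, 192, 173, 47]

# Revision ID copied from the open_revs parameter in the request URL.
requested_rev = "26621-2f92a0fc4d1c47b428e9ad5461c0ad2f"


def rev_from_term(pos, hash_bytes):
    """Format a (position, hash bytes) pair as a "pos-hexhash" string."""
    return f"{pos}-{bytes(hash_bytes).hex()}"


decoded = rev_from_term(26621, bad_term_bytes)
print(decoded)
print(decoded == requested_rev)
```

The two strings match, so the 500 response is complaining about exactly
the revision that the replicator asked for.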



-- 
Simon Eisenmann

[ mailto:simon@struktur.de ]

[ struktur AG | Kronenstraße 22a | D-70173 Stuttgart ]
[ T. +49.711.896656.0 | F.+49.711.89665610 ]
[ http://www.struktur.de | mailto:info@struktur.de ]

Re: Continuous Replication hangs with "start"

Posted by Jan Lehnardt <ja...@apache.org>.
On 5 Nov 2009, at 10:43, Simon Eisenmann wrote:

> Hi,
>
> On Wednesday, 04.11.2009, at 08:42 -0500, Adam Kocoloski wrote:
>> Hi Simon, looks like replication is failing on the following URL:
>>
>> http://10.1.1.31:5984/gangstercluster_1/ping_http:%2F%2Fredemption.intranet.struktur.de:2425?open_revs=[\
>> "26621-2f92a0fc4d1c47b428e9ad5461c0ad2f\"]&revs=true&latest=true
>>
>> It gets back a response that looks like
>>
>> HTTP/1.1 500 Internal Server Error
>> Server: CouchDB/0.10.0 (Erlang OTP/R13B)
>> Date: Wed, 04 Nov 2009 11:10:19 GMT
>> Content-Type: application/json
>> Content-Length: 114
>> Cache-Control: must-revalidate
>>
>> {\"error\":\"json_encode\",\"reason\":\"{bad_term,
>> {26621,<<47,146,160,252,77,28,71,180,40,233,173,84,97,192,173,47>>}} 
>> \"}
>> \n"
>>
>> You could try the URL for yourself and confirm this to be the case.
>
> If i try to load this URL i get
>
> {"error":"bad_request","reason":"invalid UTF-8 JSON"}
>
> Mhm not sure why i get this instead of a 500.
>
>> The problem is probably that revision 26621-.... of this document is
>> missing.  0.10.0 had a bug where if you explicitly requested a  
>> missing
>> revision, you'd get a 500 Internal Server Response.  The replicator
>> isn't in the habit of making these kinds of requests, but it's
>> probably some interaction with the full disk problem.
>> I fixed this bug in r829919 (trunk) or r829924 (0.10.x branch, didn't
>> make 0.10.0), so if you upgraded it would likely allow replication to
>> proceed.  Otherwise, I think your recourse is to make some update to
>> this document so that the replicator doesn't try to request this
>> revision anymore.  Hope it helps,
>>
>
> Ok let me try that. I will come back as soon i have the time to  
> compile
> the latest revision of the 0.10.x branch.
>
> Are there already any plans for a 0.10.1 release?

Yes, we are in the final preparations to start a vote. :)

Cheers
Jan
--

>
> Simon
>
>
>> Adam


Re: Continuous Replication hangs with "start"

Posted by Simon Eisenmann <si...@struktur.de>.
Hi,

I was not able to reproduce this problem with 0.10.1. CouchDB runs like
a charm. Great work!

Best regards
Simon

On Thursday, 05.11.2009, at 10:43 +0100, Simon Eisenmann wrote:
> > The problem is probably that revision 26621-.... of this document is
> > missing.  0.10.0 had a bug where if you explicitly requested a missing
> > revision, you'd get a 500 Internal Server Response.  The replicator
> > isn't in the habit of making these kinds of requests, but it's
> > probably some interaction with the full disk problem.
> > I fixed this bug in r829919 (trunk) or r829924 (0.10.x branch, didn't
> > make 0.10.0), so if you upgraded it would likely allow replication to
> > proceed.  Otherwise, I think your recourse is to make some update to
> > this document so that the replicator doesn't try to request this
> > revision anymore.  Hope it helps,
> >
>
> Ok let me try that. I will come back as soon i have the time to compile
> the latest revision of the 0.10.x branch.
>
> Are there already any plans for a 0.10.1 release?
-- 
Simon Eisenmann

[ mailto:simon@struktur.de ]

[ struktur AG | Kronenstraße 22a | D-70173 Stuttgart ]
[ T. +49.711.896656.68 | F.+49.711.89665610 ]
[ http://www.struktur.de | mailto:info@struktur.de ]

Re: Continuous Replication hangs with "start"

Posted by Simon Eisenmann <si...@struktur.de>.
Hi,

On Wednesday, 04.11.2009, at 08:42 -0500, Adam Kocoloski wrote:
> Hi Simon, looks like replication is failing on the following URL:
> 
> http://10.1.1.31:5984/gangstercluster_1/ping_http:%2F%2Fredemption.intranet.struktur.de:2425?open_revs=[\ 
> "26621-2f92a0fc4d1c47b428e9ad5461c0ad2f\"]&revs=true&latest=true
> 
> It gets back a response that looks like
> 
> HTTP/1.1 500 Internal Server Error
> Server: CouchDB/0.10.0 (Erlang OTP/R13B)
> Date: Wed, 04 Nov 2009 11:10:19 GMT
> Content-Type: application/json
> Content-Length: 114
> Cache-Control: must-revalidate
> 
> {\"error\":\"json_encode\",\"reason\":\"{bad_term, 
> {26621,<<47,146,160,252,77,28,71,180,40,233,173,84,97,192,173,47>>}}\"} 
> \n"
> 
> You could try the URL for yourself and confirm this to be the case.   

If I try to load this URL, I get

{"error":"bad_request","reason":"invalid UTF-8 JSON"}

Hmm, not sure why I get this instead of a 500.
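
One possible explanation, offered as a guess and not confirmed anywhere in
this thread: the URL as quoted still carries the \" escapes from the Erlang
log string, and with those backslashes the open_revs value is no longer
valid JSON. The Python sketch below builds the same request with the
open_revs value produced by a JSON encoder and then percent-encoded. The
function name open_revs_url is hypothetical, and the document ID is what
you get by decoding %2F in the logged path.

```python
import json
from urllib.parse import quote


def open_revs_url(base, db, doc_id, revs):
    """Build a GET URL asking for specific revisions of one document."""
    # open_revs has to be a JSON array; json.dumps emits plain double
    # quotes with no backslashes.
    open_revs = json.dumps(revs, separators=(",", ":"))
    return (
        f"{base}/{quote(db, safe='')}/{quote(doc_id, safe='')}"
        f"?open_revs={quote(open_revs, safe='')}&revs=true&latest=true"
    )


url = open_revs_url(
    "http://10.1.1.31:5984",
    "gangstercluster_1",
    "ping_http://redemption.intranet.struktur.de:2425",
    ["26621-2f92a0fc4d1c47b428e9ad5461c0ad2f"],
)
print(url)
```

If the guess is right, requesting the printed URL against 0.10.0 should
give the 500 that the replicator sees.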

> The problem is probably that revision 26621-.... of this document is  
> missing.  0.10.0 had a bug where if you explicitly requested a missing  
> revision, you'd get a 500 Internal Server Response.  The replicator  
> isn't in the habit of making these kinds of requests, but it's  
> probably some interaction with the full disk problem.
> I fixed this bug in r829919 (trunk) or r829924 (0.10.x branch, didn't  
> make 0.10.0), so if you upgraded it would likely allow replication to  
> proceed.  Otherwise, I think your recourse is to make some update to  
> this document so that the replicator doesn't try to request this  
> revision anymore.  Hope it helps,
> 
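
For the second suggestion (updating the document so that it gets a new
revision), a minimal Python sketch could look like the following. It only
builds the PUT request and does not send it. The field name "touched", the
helper name touch_request and the placeholder _rev are all made up for
illustration; in practice the document body would come from a GET on the
same URL, and whether this unblocks replication on 0.10.0 is untested here.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def touch_request(base, db, doc):
    """Return a PUT request that saves `doc` again with one extra field."""
    updated = dict(doc)
    # "touched" is a made-up field name; any change yields a new revision.
    updated["touched"] = True
    body = json.dumps(updated).encode("utf-8")
    url = f"{base}/{quote(db, safe='')}/{quote(doc['_id'], safe='')}"
    return Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Stand-in document; a real one would carry the current _rev from a GET.
doc = {
    "_id": "ping_http://redemption.intranet.struktur.de:2425",
    "_rev": "1-placeholder",
}
req = touch_request("http://10.1.1.31:5984", "gangstercluster_1", doc)
print(req.get_method(), req.full_url)
# Passing req to urllib.request.urlopen would send it; that step is left
# out so the sketch has no side effects.
```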

OK, let me try that. I will report back as soon as I have the time to compile
the latest revision of the 0.10.x branch.

Are there already any plans for a 0.10.1 release?

Simon


> Adam
-- 
Simon Eisenmann

[ mailto:simon@struktur.de ]

[ struktur AG | Kronenstraße 22a | D-70173 Stuttgart ]
[ T. +49.711.896656.68 | F.+49.711.89665610 ]
[ http://www.struktur.de | mailto:info@struktur.de ]

Re: Continous Replication hangs with "start"

Posted by Adam Kocoloski <ko...@apache.org>.
Hi Simon, looks like replication is failing on the following URL:

http://10.1.1.31:5984/gangstercluster_1/ping_http:%2F%2Fredemption.intranet.struktur.de:2425?open_revs=["26621-2f92a0fc4d1c47b428e9ad5461c0ad2f"]&revs=true&latest=true

It gets back a response that looks like

HTTP/1.1 500 Internal Server Error
Server: CouchDB/0.10.0 (Erlang OTP/R13B)
Date: Wed, 04 Nov 2009 11:10:19 GMT
Content-Type: application/json
Content-Length: 114
Cache-Control: must-revalidate

{"error":"json_encode","reason":"{bad_term,{26621,<<47,146,160,252,77,28,71,180,40,233,173,84,97,192,173,47>>}}"}

You could try the URL yourself to confirm this. The problem is
probably that revision 26621-.... of this document is missing. 0.10.0
had a bug where explicitly requesting a missing revision returned a
500 Internal Server Error response. The replicator isn't in the habit
of making these kinds of requests, so this is probably some
interaction with the full-disk problem.
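
One way to try that request by hand is sketched below in Python, using only
the standard library. The helper names open_revs_url and fetch_status are
made up for this sketch; the server, database, document id and revision are
the ones from the log above. The helper percent-encodes the ":" characters
in the document id as well, which decodes to the same id as the URL in the
log.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def open_revs_url(server, db, doc_id, revs):
    """Build a GET URL asking for specific revisions of one document."""
    # Percent-encode the database name and document id completely,
    # including '/' and ':' inside the id.
    path = "/%s/%s" % (
        urllib.parse.quote(db, safe=""),
        urllib.parse.quote(doc_id, safe=""),
    )
    # open_revs takes a JSON array of revision ids.
    query = urllib.parse.urlencode({
        "open_revs": json.dumps(revs, separators=(",", ":")),
        "revs": "true",
        "latest": "true",
    })
    return server.rstrip("/") + path + "?" + query


def fetch_status(url):
    """GET the URL and return (status, body) without raising on 4xx/5xx."""
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", "replace")


# Example call against the node from the log (requires network access):
#   url = open_revs_url(
#       "http://10.1.1.31:5984", "gangstercluster_1",
#       "ping_http://redemption.intranet.struktur.de:2425",
#       ["26621-2f92a0fc4d1c47b428e9ad5461c0ad2f"])
#   print(fetch_status(url))
```

If the bug described above is what is happening, fetch_status should report
status 500 with the json_encode error body instead of a normal response.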

I fixed this bug in r829919 (trunk) and r829924 (0.10.x branch; it
didn't make 0.10.0), so upgrading would likely allow replication to
proceed. Otherwise, I think your recourse is to make some update to
this document so that the replicator no longer tries to request this
revision. Hope it helps,

Adam
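
For the workaround of making some update to the document, a minimal Python
sketch might look like the following. It assumes the document can still be
read with a plain GET on the affected node. The function names and the
"touched" field are hypothetical, chosen only for illustration; any real
change to the document body should serve the same purpose. Whether the
replicator then stops requesting the old revision is the expectation stated
above, not something this sketch verifies.

```python
import json
import urllib.parse
import urllib.request


def doc_url(server, db, doc_id):
    """URL of a single document, with the id fully percent-encoded."""
    return "%s/%s/%s" % (
        server.rstrip("/"),
        urllib.parse.quote(db, safe=""),
        urllib.parse.quote(doc_id, safe=""),
    )


def touched_copy(doc, marker):
    """Return a copy of doc with one changed field.

    'touched' is a hypothetical field name used only in this sketch.
    The copy keeps _id and _rev, which CouchDB needs to accept the update.
    """
    updated = dict(doc)
    updated["touched"] = marker
    return updated


def touch_document(server, db, doc_id, marker):
    """Fetch the current revision and write it back with the marker field."""
    url = doc_url(server, db, doc_id)
    with urllib.request.urlopen(url, timeout=30) as resp:
        doc = json.loads(resp.read().decode("utf-8"))
    body = json.dumps(touched_copy(doc, marker)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, method="PUT",
        headers={"Content-Type": "application/json"},
    )
    # On success CouchDB answers with the id and the new revision.
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (requires network access to the node):
#   touch_document("http://10.1.1.31:5984", "gangstercluster_1",
#                  "ping_http://redemption.intranet.struktur.de:2425",
#                  "2009-11-04")
```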
