You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@subversion.apache.org by Julian Foad <ju...@wandisco.com> on 2010/02/11 12:33:56 UTC

Serf hangs in a merge where Neon succeeds

I tried a dry-run merge, using a trunk@908933 client, and it hangs with
Serf, but completes in a few seconds with Neon. (When I say it hangs, an
identical test except using a trunk build from 2010-02-04 hung for 25
minutes before I interrupted it.)

The test case here uses a 1.6.x@902803 WC, clean except for a local mod
to the 'STATUS' file. The "##" lines are comments I typed while I was
waiting, to remind me when I issued the command.

[[[
$ svn merge --dry-run ^/subversion/branches/1.6.x-r891672/@902803
--- Merging r891676 through r902803 into '.':
U    subversion/tests/cmdline/externals_tests.py
## Started ~10:49 ...
^C/home/julianfoad/src/subversion-a/subversion/svn/util.c:900: (apr_err=4)
/home/julianfoad/src/subversion-a/subversion/libsvn_client/merge.c:9460: (apr_err=4)
/home/julianfoad/src/subversion-a/subversion/libsvn_client/merge.c:8082: (apr_err=4)
/home/julianfoad/src/subversion-a/subversion/libsvn_client/merge.c:7753: (apr_err=4)
/home/julianfoad/src/subversion-a/subversion/libsvn_client/merge.c:4863: (apr_err=4)
/home/julianfoad/src/subversion-a/subversion/libsvn_ra_serf/update.c:2248: (apr_err=4)
svn: Error retrieving REPORT (4): Interrupted system call

$ svn merge --dry-run ^/subversion/branches/1.6.x-r891672/@902803 --config-option=servers:global:http-library=neon
--- Merging r891676 through r902803 into '.':
U    subversion/tests/cmdline/externals_tests.py
U    subversion/libsvn_client/commit_util.c

$ svn merge --dry-run ^/subversion/branches/1.6.x-r891672/@902803 --config-option=servers:global:http-library=neon
--- Merging r891676 through r902803 into '.':
U    subversion/tests/cmdline/externals_tests.py
U    subversion/libsvn_client/commit_util.c

$ svn merge --dry-run ^/subversion/branches/1.6.x-r891672/@902803 --config-option=servers:global:http-library=serf
--- Merging r891676 through r902803 into '.':
U    subversion/tests/cmdline/externals_tests.py
## Started 11:16 ...
]]]

Here's the stack trace if I interrupt it in GDB after about 10 seconds:
[[[
--- Merging r891676 through r902803 into '.':
U    subversion/tests/cmdline/externals_tests.py
^C
Program received signal SIGINT, Interrupt.
0x0040e422 in __kernel_vsyscall ()
(gdb) bt
#0  0x0040e422 in __kernel_vsyscall ()
#1  0x004dc156 in epoll_wait () from /lib/tls/i686/cmov/libc.so.6
#2  0x0061d3ad in apr_pollset_poll () from /usr/lib/libapr-1.so.0
#3  0x00d149e8 in serf_context_run () from /usr/lib/libserf-0.so.0
#4  0x00af2b9d in finish_report (report_baton=0x966f148, pool=0x966ef30)
    at /home/julianfoad/src/subversion-b/subversion/libsvn_ra_serf/update.c:2237
#5  0x002eff90 in drive_merge_report_editor (
    target_abspath=0x81b5c70 "/home/julianfoad/src/subversion-1.6.x", 
    url1=0x80ad3d0 "https://svn.apache.org/repos/asf/subversion/branches/1.6.x", revision1=891675, 
    url2=0x80ad448 "https://svn.apache.org/repos/asf/subversion/branches/1.6.x-r891672", 
    revision2=902803, children_with_mergeinfo=0x9123318, depth=svn_depth_infinity, 
    notify_b=0xbfffede4, callbacks=0x31be60, merge_b=0xbfffed74, pool=0x966ef30)
    at /home/julianfoad/src/subversion-b/subversion/libsvn_client/merge.c:4863
#6  0x002f5b76 in do_directory_merge (
    url1=0x80ad3d0 "https://svn.apache.org/repos/asf/subversion/branches/1.6.x", revision1=891675, 
    url2=0x80ad448 "https://svn.apache.org/repos/asf/subversion/branches/1.6.x-r891672", 
    revision2=902803, target_abspath=0x81b5c70 "/home/julianfoad/src/subversion-1.6.x", 
    depth=svn_depth_infinity, squelch_mergeinfo_notifications=0, notify_b=0xbfffede4, 
    merge_b=0xbfffed74, pool=0x9123270)
    at /home/julianfoad/src/subversion-b/subversion/libsvn_client/merge.c:7746
#7  0x002f66cf in do_merge (merge_sources=0x80ac9f0, 
    target_abspath=0x81b5c70 "/home/julianfoad/src/subversion-1.6.x", sources_ancestral=1, 
    sources_related=1, same_repos=1, ignore_ancestry=0, force=0, dry_run=1, record_only=0, 
    reintegrate_merge=0, squelch_mergeinfo_notifications=0, depth=svn_depth_infinity, 
    merge_options=0x0, use_sleep=0xbfffef30, ctx=0x808fb60, pool=0x808f200)
    at /home/julianfoad/src/subversion-b/subversion/libsvn_client/merge.c:8079
#8  0x002fa646 in svn_client_merge_peg3 (
    source=0x81b5968 "https://svn.apache.org/repos/asf/subversion/branches/1.6.x-r891672", 
    ranges_to_merge=0x81b5b80, peg_revision=0xbfffefe0, target_wcpath=0x8073548 "", 
    depth=svn_depth_unknown, ignore_ancestry=0, force=0, record_only=0, dry_run=1, 
    merge_options=0x0, ctx=0x808fb60, pool=0x808f200)
    at /home/julianfoad/src/subversion-b/subversion/libsvn_client/merge.c:9451
#9  0x0805b0b2 in svn_cl__merge (os=0x808f3a8, baton=0xbffff248, pool=0x808f200)
    at /home/julianfoad/src/subversion-b/subversion/svn/merge-cmd.c:322
#10 0x0805a53e in main (argc=5, argv=0xbffff394)
    at /home/julianfoad/src/subversion-b/subversion/svn/main.c:2249
]]]

If in GDB I step through the big loop in RA-serf's finish_report(),
pausing in each iteration, then it works [1]. If, however, I break at
the beginning of the loop, set a breakpoint at the end of the loop, and
let that loop run through at full speed, then it hangs in
serf_context_run() in some iteration of the loop. [2]

If I insert debug prints [3], I get:

[[[
## ra-serf finish_report() loop:
## done=0, fetches=0, propfinds=0, t=-694967296
## done=0, num_conns=2, cur_conn=1
## done=0, fetches=0, propfinds=0, t=-694967296
## done=0, num_conns=2, cur_conn=1
## done=0, fetches=0, propfinds=0, t=-694967296
## done=0, num_conns=2, cur_conn=1
## done=0, fetches=0, propfinds=0, t=-694967296
## done=0, num_conns=2, cur_conn=1
## done=0, fetches=2, propfinds=0, t=-694967296
## done=1, num_conns=2, cur_conn=1
## done=1, fetches=2, propfinds=0, t=-694967296
## done=1, num_conns=2, cur_conn=1
[... hundreds of repeats ...]
## done=1, fetches=2, propfinds=0, t=-694967296
## done=1, num_conns=2, cur_conn=1
## done=1, fetches=2, propfinds=0, t=-694967296
--- Merging r891676 through r902803 into '.':
U    subversion/tests/cmdline/externals_tests.py
## done=1, num_conns=2, cur_conn=1
## done=1, fetches=1, propfinds=0, t=-694967296
[... hangs here ...]
]]]

The timeout looks wrong, for a start. Could that be the problem?

My Serf is version 0.3.0.

- Julian


[1]
[[[
(gdb) run
Starting program: /home/julianfoad/build/subversion-b/subversion/svn/.libs/lt-svn merge --dry-run \^/subversion/branches/1.6.x-r891672/@902803 --config-option=servers:global:http-library=serf
[Thread debugging using libthread_db enabled]
[New Thread 0xb7eb2b70 (LWP 10080)]
[Thread 0xb7eb2b70 (LWP 10080) exited]

Breakpoint 2, finish_report (report_baton=0x966f148, pool=0x966ef30)
    at /home/julianfoad/src/subversion-b/subversion/libsvn_ra_serf/update.c:2235
2235	  while (!report->done || report->active_fetches || report->active_propfinds)
(gdb) c
Continuing.

Breakpoint 3, finish_report (report_baton=0x966f148, pool=0x966ef30)
    at /home/julianfoad/src/subversion-b/subversion/libsvn_ra_serf/update.c:2238
2238	      if (APR_STATUS_IS_TIMEUP(status))
(gdb) c
Continuing.

Breakpoint 3, finish_report (report_baton=0x966f148, pool=0x966ef30)
    at /home/julianfoad/src/subversion-b/subversion/libsvn_ra_serf/update.c:2238
2238	      if (APR_STATUS_IS_TIMEUP(status))
(gdb) c
Continuing.

Breakpoint 3, finish_report (report_baton=0x966f148, pool=0x966ef30)
    at /home/julianfoad/src/subversion-b/subversion/libsvn_ra_serf/update.c:2238
2238	      if (APR_STATUS_IS_TIMEUP(status))
(gdb) c
Continuing.

Breakpoint 3, finish_report (report_baton=0x966f148, pool=0x966ef30)
    at /home/julianfoad/src/subversion-b/subversion/libsvn_ra_serf/update.c:2238
2238	      if (APR_STATUS_IS_TIMEUP(status))
(gdb) c
Continuing.

Breakpoint 3, finish_report (report_baton=0x966f148, pool=0x966ef30)
    at /home/julianfoad/src/subversion-b/subversion/libsvn_ra_serf/update.c:2238
2238	      if (APR_STATUS_IS_TIMEUP(status))
(gdb) c
Continuing.

Breakpoint 3, finish_report (report_baton=0x966f148, pool=0x966ef30)
    at /home/julianfoad/src/subversion-b/subversion/libsvn_ra_serf/update.c:2238
2238	      if (APR_STATUS_IS_TIMEUP(status))
(gdb) c
Continuing.
--- Merging r891676 through r902803 into '.':
U    subversion/tests/cmdline/externals_tests.py
U    subversion/libsvn_client/commit_util.c

Breakpoint 3, finish_report (report_baton=0x966f148, pool=0x966ef30)
    at /home/julianfoad/src/subversion-b/subversion/libsvn_ra_serf/update.c:2238
2238	      if (APR_STATUS_IS_TIMEUP(status))
(gdb) c
Continuing.

Program exited normally.
]]]


[2]
[[[ 
Starting program: /home/julianfoad/build/subversion-b/subversion/svn/.libs/lt-svn merge --dry-run \^/subversion/branches/1.6.x-r891672/@902803 --config-option=servers:global:http-library=serf
[Thread debugging using libthread_db enabled]
[New Thread 0xb7eb2b70 (LWP 31900)]
[Thread 0xb7eb2b70 (LWP 31900) exited]

Breakpoint 2, finish_report (report_baton=0x966f148, pool=0x966ef30)
    at /home/julianfoad/src/subversion-b/subversion/libsvn_ra_serf/update.c:2235
2235	  while (!report->done || report->active_fetches || report->active_propfinds)
(gdb) b 2382
Breakpoint 4 at 0xf4dfb8: file /home/julianfoad/src/subversion-b/subversion/libsvn_ra_serf/update.c, line 2382.
(gdb) c
Continuing.
--- Merging r891676 through r902803 into '.':
U    subversion/tests/cmdline/externals_tests.py
^C
]]]


[3]
[[[
  printf("## ra-serf finish_report() loop:\n");
  while (!report->done || report->active_fetches || report->active_propfinds)
    {
      printf("## done=%d, fetches=%u, propfinds=%u, t=%ld\n", report->done, report->active_fetches, report->active_propfinds, sess->timeout); fflush(stdout);
      status = serf_context_run(sess->context, sess->timeout, pool);
      if (APR_STATUS_IS_TIMEUP(status))
        {
          return svn_error_create(SVN_ERR_RA_DAV_CONN_TIMEOUT,
                                  NULL,
                                  _("Connection timed out"));
        }
      if (status)
        {
          SVN_ERR(sess->pending_error);

          return svn_error_wrap_apr(status, _("Error retrieving REPORT (%d)"),
                                    status);
        }

      printf("## done=%d, num_conns=%d, cur_conn=%d\n", (int)report->done, (int)sess->num_conns, (int)sess->cur_conn); fflush(stdout);
      /* Open extra connections if we have enough requests to send. */
      if (sess->num_conns < MAX_NR_OF_CONNS)
        SVN_ERR(open_connection_if_needed(sess, report->active_fetches +
                                          report->active_propfinds));

      /* Switch our connection. */
      [...]
]]]


Re: [serf-dev] Re: Serf hangs in a merge where Neon succeeds

Posted by Lieven Govaerts <sv...@mobsol.be>.
On Thu, Feb 11, 2010 at 5:33 PM, Lieven Govaerts <sv...@mobsol.be> wrote:
> On Thu, Feb 11, 2010 at 5:13 PM, Justin Erenkrantz
> <ju...@erenkrantz.com> wrote:
>> On Thu, Feb 11, 2010 at 7:48 AM, Julian Foad <ju...@wandisco.com> wrote:
>>> OK, tried with branches/0.3.x@1309.
>>>
>>> That works fine.
>>
>> Sounds like we should cut serf 0.3.1.
>>
>> Does anyone have anything outstanding to bring into 0.3.x?  If not, I
>> can do a release soon.  -- justin
>>
>
> There have been some fixes in the build scripts and a few bugfixes on
> trunk that we should take along in 0.3.1.
>
> I can merge them to the 0.3.x branch tonight and run some tests.

Merged the important changes to 0.3.x. I'm not really sure about
r1285,r1296,r1255, I'll leave these up to you to decide on.

Tests on Ubuntu ran fine.

Lieven

Re: [serf-dev] Re: Serf hangs in a merge where Neon succeeds

Posted by Lieven Govaerts <sv...@mobsol.be>.
On Thu, Feb 11, 2010 at 5:13 PM, Justin Erenkrantz
<ju...@erenkrantz.com> wrote:
> On Thu, Feb 11, 2010 at 7:48 AM, Julian Foad <ju...@wandisco.com> wrote:
>> OK, tried with branches/0.3.x@1309.
>>
>> That works fine.
>
> Sounds like we should cut serf 0.3.1.
>
> Does anyone have anything outstanding to bring into 0.3.x?  If not, I
> can do a release soon.  -- justin
>

There have been some fixes in the build scripts and a few bugfixes on
trunk that we should take along in 0.3.1.

I can merge them to the 0.3.x branch tonight and run some tests.

Lieven

Re: Serf hangs in a merge where Neon succeeds

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
On Thu, Feb 11, 2010 at 7:48 AM, Julian Foad <ju...@wandisco.com> wrote:
> OK, tried with branches/0.3.x@1309.
>
> That works fine.

Sounds like we should cut serf 0.3.1.

Does anyone have anything outstanding to bring into 0.3.x?  If not, I
can do a release soon.  -- justin

Re: Serf hangs in a merge where Neon succeeds

Posted by Julian Foad <ju...@wandisco.com>.
On Thu, 2010-02-11, Lieven Govaerts wrote:
> On Thu, Feb 11, 2010 at 2:41 PM, Julian Foad <ju...@btopenworld.com> wrote:
> > Lieven Govaerts wrote:
> >> Can you try with serf branch 0.3.x? I fixed a connection cleanup bug
> >> some time ago that caused serf to hang in apr_pollset_poll.
> >>
> >> http://serf.googlecode.com/svn/branches/0.3.x/
> >
> > OK, I tried with Serf trunk r1309. Seg fault:
> >
> 
> The serf api has changed in trunk, you'll need to do a full rebuild.
> Note that I proposed you the test the 0.3.x branch, not trunk.

OK, tried with branches/0.3.x@1309.

That works fine.

Thanks.
- Julian


Re: Serf hangs in a merge where Neon succeeds

Posted by Julian Foad <ju...@btopenworld.com>.
On Thu, 2010-02-11 at 15:03 +0100, Lieven Govaerts wrote:
> On Thu, Feb 11, 2010 at 2:41 PM, Julian Foad <ju...@btopenworld.com> wrote:
> > Lieven Govaerts wrote:
> >> Can you try with serf branch 0.3.x? I fixed a connection cleanup bug
> >> some time ago that caused serf to hang in apr_pollset_poll.
> >>
> >> http://serf.googlecode.com/svn/branches/0.3.x/
> >
> > OK, I tried with Serf trunk r1309. Seg fault:

> The serf api has changed in trunk, you'll need to do a full rebuild.
> Note that I proposed you the test the 0.3.x branch, not trunk.

Oops, sorry - I missed that.

- Julian


Re: Serf hangs in a merge where Neon succeeds

Posted by Lieven Govaerts <sv...@mobsol.be>.
On Thu, Feb 11, 2010 at 2:41 PM, Julian Foad <ju...@btopenworld.com> wrote:
> Lieven Govaerts wrote:
>> Can you try with serf branch 0.3.x? I fixed a connection cleanup bug
>> some time ago that caused serf to hang in apr_pollset_poll.
>>
>> http://serf.googlecode.com/svn/branches/0.3.x/
>
> OK, I tried with Serf trunk r1309. Seg fault:
>

The serf api has changed in trunk, you'll need to do a full rebuild.
Note that I proposed you the test the 0.3.x branch, not trunk.

[..]

Lieven

Re: Serf hangs in a merge where Neon succeeds

Posted by Julian Foad <ju...@btopenworld.com>.
Lieven Govaerts wrote:
> Can you try with serf branch 0.3.x? I fixed a connection cleanup bug
> some time ago that caused serf to hang in apr_pollset_poll.
> 
> http://serf.googlecode.com/svn/branches/0.3.x/

OK, I tried with Serf trunk r1309. Seg fault:

[[[
$ LD_LIBRARY_PATH=/home/julianfoad/local/serf-trunk gsvn merge --dry-run ^/subversion/branches/1.6.x-r891672/@902803
Reading symbols from /home/julianfoad/build/subversion-b/subversion/svn/.libs/lt-svn...done.
Temporary breakpoint 1 at 0x8057804: file /home/julianfoad/src/subversion-b/subversion/svn/main.c, line 1209.
Starting program: /home/julianfoad/build/subversion-b/subversion/svn/.libs/lt-svn merge --dry-run \^/subversion/branches/1.6.x-r891672/@902803
[Thread debugging using libthread_db enabled]

Temporary breakpoint 1, main (argc=4, argv=0xbffff414)
    at /home/julianfoad/src/subversion-b/subversion/svn/main.c:1209
1209	  svn_cl__opt_state_t opt_state = { 0, { 0 } };
(gdb) c
Continuing.
[New Thread 0xb7eb2b70 (LWP 8872)]
[Thread 0xb7eb2b70 (LWP 8872) exited]

Program received signal SIGSEGV, Segmentation fault.
0x003c51af in svn_ra_serf__conn_setup (sock=0x95de4d8, baton=0x95e04e4, pool=0xbfffeb48)
    at /home/julianfoad/src/subversion-b/subversion/libsvn_ra_serf/util.c:282
282	  rb = serf_context_bucket_socket_create(conn->session->context,
(gdb) l
277	{
278	#endif
279	  serf_bucket_t *rb = NULL;
280	  svn_ra_serf__connection_t *conn = baton;
281	
282	  rb = serf_context_bucket_socket_create(conn->session->context,
283	                                         sock, conn->bkt_alloc);
284	
285	  if (conn->using_ssl)
286	    {
(gdb) p conn
$1 = (svn_ra_serf__connection_t *) 0x95e04e4
(gdb) p conn->session 
$2 = (svn_ra_serf__session_t *) 0x0
(gdb) bt
#0  0x003c51af in svn_ra_serf__conn_setup (sock=0x95de4d8, baton=0x95e04e4, pool=0xbfffeb48)
    at /home/julianfoad/src/subversion-b/subversion/libsvn_ra_serf/util.c:282
#1  0x00371c3d in do_conn_setup () from /home/julianfoad/local/serf-trunk/libserf-0.so.0
#2  0x00371e94 in write_to_connection () from /home/julianfoad/local/serf-trunk/libserf-0.so.0
#3  0x0037263f in serf__process_connection () from /home/julianfoad/local/serf-trunk/libserf-0.so.0
#4  0x00370ab0 in serf_event_trigger () from /home/julianfoad/local/serf-trunk/libserf-0.so.0
#5  0x00370c14 in serf_context_run () from /home/julianfoad/local/serf-trunk/libserf-0.so.0
#6  0x003c5908 in svn_ra_serf__context_run_wait (done=0x9125004, sess=0x9124ee8, pool=0x9124e00)
    at /home/julianfoad/src/subversion-b/subversion/libsvn_ra_serf/util.c:582
#7  0x003b8624 in svn_ra_serf__exchange_capabilities (serf_sess=0x9124ee8, pool=0x9124e00)
    at /home/julianfoad/src/subversion-b/subversion/libsvn_ra_serf/options.c:475
#8  0x003bd6f0 in svn_ra_serf__open (session=0x9124ed8, 
    repos_URL=0x8210580 "https://svn.apache.org/repos/asf/subversion/branches/1.6.x-r891672", 
    callbacks=0x9124e40, callback_baton=0x9124e68, config=0x808fba8, pool=0x9124e00)
    at /home/julianfoad/src/subversion-b/subversion/libsvn_ra_serf/serf.c:455
#9  0x00d2e067 in svn_ra_open3 (session_p=0xbfffefb8, 
    repos_URL=0x8210580 "https://svn.apache.org/repos/asf/subversion/branches/1.6.x-r891672", 
    uuid=0x0, callbacks=0x9124e40, callback_baton=0x9124e68, config=0x808fba8, pool=0x9124e00)
    at /home/julianfoad/src/subversion-b/subversion/libsvn_ra/ra_loader.c:485
#10 0x00654a50 in svn_client__open_ra_session_internal (ra_session=0xbfffefb8, 
    base_url=0x8210580 "https://svn.apache.org/repos/asf/subversion/branches/1.6.x-r891672", 
    base_dir=0x0, commit_items=0x0, use_admin=0, read_only_wc=1, ctx=0x808fab0, pool=0x9124e00)
    at /home/julianfoad/src/subversion-b/subversion/libsvn_client/ra.c:332
#11 0x006483c0 in svn_client_merge_peg3 (
    source=0x80a43a0 "https://svn.apache.org/repos/asf/subversion/branches/1.6.x-r891672", 
    ranges_to_merge=0x80a45b8, peg_revision=0xbffff060, target_wcpath=0x8073548 "", 
    depth=svn_depth_unknown, ignore_ancestry=0, force=0, record_only=0, dry_run=1, 
    merge_options=0x0, ctx=0x808fab0, pool=0x808f200)
    at /home/julianfoad/src/subversion-b/subversion/libsvn_client/merge.c:9423
#12 0x0805b0b2 in svn_cl__merge (os=0x808f3a8, baton=0xbffff2c8, pool=0x808f200)
    at /home/julianfoad/src/subversion-b/subversion/svn/merge-cmd.c:322
#13 0x0805a53e in main (argc=4, argv=0xbffff414)
    at /home/julianfoad/src/subversion-b/subversion/svn/main.c:2249
]]]

- Julian


Re: Serf hangs in a merge where Neon succeeds

Posted by Lieven Govaerts <sv...@mobsol.be>.
On Thu, Feb 11, 2010 at 1:33 PM, Julian Foad <ju...@wandisco.com> wrote:
> I tried a dry-run merge, using a trunk@908933 client, and it hangs with
> Serf, but completes in a few seconds with Neon. (When I say it hangs, an
> identical test except using a trunk build from 2010-02-04 hung for 25
> minutes before I interrupted it.)
>
> The test case here uses a 1.6.x@902803 WC, clean except for a local mod
> to the 'STATUS' file. The "##" lines are comments I typed while I was
> waiting, to remind me when I issued the command.
>
> [[[
> $ svn merge --dry-run ^/subversion/branches/1.6.x-r891672/@902803
> --- Merging r891676 through r902803 into '.':
> U    subversion/tests/cmdline/externals_tests.py
> ## Started ~10:49 ...
> ^C/home/julianfoad/src/subversion-a/subversion/svn/util.c:900: (apr_err=4)
> /home/julianfoad/src/subversion-a/subversion/libsvn_client/merge.c:9460: (apr_err=4)
> /home/julianfoad/src/subversion-a/subversion/libsvn_client/merge.c:8082: (apr_err=4)
> /home/julianfoad/src/subversion-a/subversion/libsvn_client/merge.c:7753: (apr_err=4)
> /home/julianfoad/src/subversion-a/subversion/libsvn_client/merge.c:4863: (apr_err=4)
> /home/julianfoad/src/subversion-a/subversion/libsvn_ra_serf/update.c:2248: (apr_err=4)
> svn: Error retrieving REPORT (4): Interrupted system call
>
> $ svn merge --dry-run ^/subversion/branches/1.6.x-r891672/@902803 --config-option=servers:global:http-library=neon
> --- Merging r891676 through r902803 into '.':
> U    subversion/tests/cmdline/externals_tests.py
> U    subversion/libsvn_client/commit_util.c
>
> $ svn merge --dry-run ^/subversion/branches/1.6.x-r891672/@902803 --config-option=servers:global:http-library=neon
> --- Merging r891676 through r902803 into '.':
> U    subversion/tests/cmdline/externals_tests.py
> U    subversion/libsvn_client/commit_util.c
>
> $ svn merge --dry-run ^/subversion/branches/1.6.x-r891672/@902803 --config-option=servers:global:http-library=serf
> --- Merging r891676 through r902803 into '.':
> U    subversion/tests/cmdline/externals_tests.py
> ## Started 11:16 ...
> ]]]
>
> Here's the stack trace if I interrupt it in GDB after about 10 seconds:
> [[[
> --- Merging r891676 through r902803 into '.':
> U    subversion/tests/cmdline/externals_tests.py
> ^C
> Program received signal SIGINT, Interrupt.
> 0x0040e422 in __kernel_vsyscall ()
> (gdb) bt
> #0  0x0040e422 in __kernel_vsyscall ()
> #1  0x004dc156 in epoll_wait () from /lib/tls/i686/cmov/libc.so.6
> #2  0x0061d3ad in apr_pollset_poll () from /usr/lib/libapr-1.so.0
> #3  0x00d149e8 in serf_context_run () from /usr/lib/libserf-0.so.0
> #4  0x00af2b9d in finish_report (report_baton=0x966f148, pool=0x966ef30)
>    at /home/julianfoad/src/subversion-b/subversion/libsvn_ra_serf/update.c:2237
> #5  0x002eff90 in drive_merge_report_editor (
>    target_abspath=0x81b5c70 "/home/julianfoad/src/subversion-1.6.x",
>    url1=0x80ad3d0 "https://svn.apache.org/repos/asf/subversion/branches/1.6.x", revision1=891675,
>    url2=0x80ad448 "https://svn.apache.org/repos/asf/subversion/branches/1.6.x-r891672",
>    revision2=902803, children_with_mergeinfo=0x9123318, depth=svn_depth_infinity,
>    notify_b=0xbfffede4, callbacks=0x31be60, merge_b=0xbfffed74, pool=0x966ef30)
>    at /home/julianfoad/src/subversion-b/subversion/libsvn_client/merge.c:4863
> #6  0x002f5b76 in do_directory_merge (
>    url1=0x80ad3d0 "https://svn.apache.org/repos/asf/subversion/branches/1.6.x", revision1=891675,
>    url2=0x80ad448 "https://svn.apache.org/repos/asf/subversion/branches/1.6.x-r891672",
>    revision2=902803, target_abspath=0x81b5c70 "/home/julianfoad/src/subversion-1.6.x",
>    depth=svn_depth_infinity, squelch_mergeinfo_notifications=0, notify_b=0xbfffede4,
>    merge_b=0xbfffed74, pool=0x9123270)
>    at /home/julianfoad/src/subversion-b/subversion/libsvn_client/merge.c:7746
> #7  0x002f66cf in do_merge (merge_sources=0x80ac9f0,
>    target_abspath=0x81b5c70 "/home/julianfoad/src/subversion-1.6.x", sources_ancestral=1,
>    sources_related=1, same_repos=1, ignore_ancestry=0, force=0, dry_run=1, record_only=0,
>    reintegrate_merge=0, squelch_mergeinfo_notifications=0, depth=svn_depth_infinity,
>    merge_options=0x0, use_sleep=0xbfffef30, ctx=0x808fb60, pool=0x808f200)
>    at /home/julianfoad/src/subversion-b/subversion/libsvn_client/merge.c:8079
> #8  0x002fa646 in svn_client_merge_peg3 (
>    source=0x81b5968 "https://svn.apache.org/repos/asf/subversion/branches/1.6.x-r891672",
>    ranges_to_merge=0x81b5b80, peg_revision=0xbfffefe0, target_wcpath=0x8073548 "",
>    depth=svn_depth_unknown, ignore_ancestry=0, force=0, record_only=0, dry_run=1,
>    merge_options=0x0, ctx=0x808fb60, pool=0x808f200)
>    at /home/julianfoad/src/subversion-b/subversion/libsvn_client/merge.c:9451
> #9  0x0805b0b2 in svn_cl__merge (os=0x808f3a8, baton=0xbffff248, pool=0x808f200)
>    at /home/julianfoad/src/subversion-b/subversion/svn/merge-cmd.c:322
> #10 0x0805a53e in main (argc=5, argv=0xbffff394)
>    at /home/julianfoad/src/subversion-b/subversion/svn/main.c:2249
> ]]]
>
> If in GDB I step through the big loop in RA-serf's finish_report(),
> pausing in each iteration, then it works [1]. If, however, I break at
> the beginning of the loop, set a breakpoint at the end of the loop, and
> let that loop run through at full speed, then it hangs in
> serf_context_run() in some iteration of the loop. [2]
>
> If I insert debug prints [3], I get:
>
> [[[
> ## ra-serf finish_report() loop:
> ## done=0, fetches=0, propfinds=0, t=-694967296
> ## done=0, num_conns=2, cur_conn=1
> ## done=0, fetches=0, propfinds=0, t=-694967296
> ## done=0, num_conns=2, cur_conn=1
> ## done=0, fetches=0, propfinds=0, t=-694967296
> ## done=0, num_conns=2, cur_conn=1
> ## done=0, fetches=0, propfinds=0, t=-694967296
> ## done=0, num_conns=2, cur_conn=1
> ## done=0, fetches=2, propfinds=0, t=-694967296
> ## done=1, num_conns=2, cur_conn=1
> ## done=1, fetches=2, propfinds=0, t=-694967296
> ## done=1, num_conns=2, cur_conn=1
> [... hundreds of repeats ...]
> ## done=1, fetches=2, propfinds=0, t=-694967296
> ## done=1, num_conns=2, cur_conn=1
> ## done=1, fetches=2, propfinds=0, t=-694967296
> --- Merging r891676 through r902803 into '.':
> U    subversion/tests/cmdline/externals_tests.py
> ## done=1, num_conns=2, cur_conn=1
> ## done=1, fetches=1, propfinds=0, t=-694967296
> [... hangs here ...]
> ]]]
>
> The timeout looks wrong, for a start. Could that be the problem?
>
> My Serf is version 0.3.0.
>
[..]

Can you try with serf branch 0.3.x? I fixed a connection cleanup bug
some time ago that caused serf to hang in apr_pollset_poll.

http://serf.googlecode.com/svn/branches/0.3.x/

Lieven

Re: Serf hangs in a merge where Neon succeeds

Posted by Julian Foad <ju...@wandisco.com>.
On Thu, 2010-02-11 at 14:10 +0000, Philip Martin wrote:
> Julian Foad <ju...@wandisco.com> writes:
> 
> > ## done=1, fetches=1, propfinds=0, t=-694967296
> > [... hangs here ...]
> > ]]]
> >
> > The timeout looks wrong, for a start. Could that be the problem?
> >
> > My Serf is version 0.3.0.
> 
> On my Debian machine using trunk@908909 and serf 0.3.0 the timeout is
> positive and it doesn't hang:
> 
> --- Merging r891676 through r902803 into '1.6.x':
> U    1.6.x/subversion/tests/cmdline/externals_tests.py
> ## done=1, num_conns=2, cur_conn=1
> ## done=1, fetches=1, propfinds=0, t=3600000000
> U    1.6.x/subversion/libsvn_client/commit_util.c
> ## done=1, num_conns=2, cur_conn=1
> $
> 
> I note that 3600000000 + 694967296 = 2^32.  Are you using a 32-bit OS?

Yes I'm using a 32-bit OS.

> Have you specified a very big http-timeout value in your servers file?

No timeouts are specified in my .subversion/servers file. The timeout
lines are all commented out.

I think the timeout is not coming into play in this bug, and is just
another unrelated bug.

- Julian


Re: Serf hangs in a merge where Neon succeeds

Posted by Lieven Govaerts <sv...@mobsol.be>.
On Thu, Feb 11, 2010 at 3:10 PM, Philip Martin
<ph...@wandisco.com> wrote:
> Julian Foad <ju...@wandisco.com> writes:
>
>> ## done=0, fetches=2, propfinds=0, t=-694967296
>> ## done=1, num_conns=2, cur_conn=1
>> ## done=1, fetches=2, propfinds=0, t=-694967296
>> ## done=1, num_conns=2, cur_conn=1
>> [... hundreds of repeats ...]
>> ## done=1, fetches=2, propfinds=0, t=-694967296
>> ## done=1, num_conns=2, cur_conn=1
>> ## done=1, fetches=2, propfinds=0, t=-694967296
>> --- Merging r891676 through r902803 into '.':
>> U    subversion/tests/cmdline/externals_tests.py
>> ## done=1, num_conns=2, cur_conn=1
>> ## done=1, fetches=1, propfinds=0, t=-694967296
>> [... hangs here ...]
>> ]]]
>>
>> The timeout looks wrong, for a start. Could that be the problem?
>>
>> My Serf is version 0.3.0.
>
> On my Debian machine using trunk@908909 and serf 0.3.0 the timeout is
> positive and it doesn't hang:
>
> $ subversion/svn/svn merge --dry-run ^/subversion/branches/1.6.x-r891672/@902803 --config-option=servers:global:http-library=serf
> ## done=1, num_conns=2, cur_conn=1
> ## done=1, fetches=2, propfinds=0, t=3600000000
> ## done=1, num_conns=2, cur_conn=1
> ## done=1, fetches=2, propfinds=0, t=3600000000
> ## done=1, num_conns=2, cur_conn=1
> ## done=1, fetches=2, propfinds=0, t=3600000000
> ## done=1, num_conns=2, cur_conn=1
> ## done=1, fetches=2, propfinds=0, t=3600000000
> ## done=1, num_conns=2, cur_conn=1
> ## done=1, fetches=2, propfinds=0, t=3600000000
> --- Merging r891676 through r902803 into '1.6.x':
> U    1.6.x/subversion/tests/cmdline/externals_tests.py
> ## done=1, num_conns=2, cur_conn=1
> ## done=1, fetches=1, propfinds=0, t=3600000000
> U    1.6.x/subversion/libsvn_client/commit_util.c
> ## done=1, num_conns=2, cur_conn=1
> $
>
> I note that 3600000000 + 694967296 = 2^32.  Are you using a 32-bit OS?
> Have you specified a very big http-timeout value in your servers file?
>
I think that's what serf uses as default timeout.

Lieven

Re: Serf hangs in a merge where Neon succeeds

Posted by Philip Martin <ph...@wandisco.com>.
Julian Foad <ju...@wandisco.com> writes:

> ## done=0, fetches=2, propfinds=0, t=-694967296
> ## done=1, num_conns=2, cur_conn=1
> ## done=1, fetches=2, propfinds=0, t=-694967296
> ## done=1, num_conns=2, cur_conn=1
> [... hundreds of repeats ...]
> ## done=1, fetches=2, propfinds=0, t=-694967296
> ## done=1, num_conns=2, cur_conn=1
> ## done=1, fetches=2, propfinds=0, t=-694967296
> --- Merging r891676 through r902803 into '.':
> U    subversion/tests/cmdline/externals_tests.py
> ## done=1, num_conns=2, cur_conn=1
> ## done=1, fetches=1, propfinds=0, t=-694967296
> [... hangs here ...]
> ]]]
>
> The timeout looks wrong, for a start. Could that be the problem?
>
> My Serf is version 0.3.0.

On my Debian machine using trunk@908909 and serf 0.3.0 the timeout is
positive and it doesn't hang:

$ subversion/svn/svn merge --dry-run ^/subversion/branches/1.6.x-r891672/@902803 --config-option=servers:global:http-library=serf
## done=1, num_conns=2, cur_conn=1
## done=1, fetches=2, propfinds=0, t=3600000000
## done=1, num_conns=2, cur_conn=1
## done=1, fetches=2, propfinds=0, t=3600000000
## done=1, num_conns=2, cur_conn=1
## done=1, fetches=2, propfinds=0, t=3600000000
## done=1, num_conns=2, cur_conn=1
## done=1, fetches=2, propfinds=0, t=3600000000
## done=1, num_conns=2, cur_conn=1
## done=1, fetches=2, propfinds=0, t=3600000000
--- Merging r891676 through r902803 into '1.6.x':
U    1.6.x/subversion/tests/cmdline/externals_tests.py
## done=1, num_conns=2, cur_conn=1
## done=1, fetches=1, propfinds=0, t=3600000000
U    1.6.x/subversion/libsvn_client/commit_util.c
## done=1, num_conns=2, cur_conn=1
$

I note that 3600000000 + 694967296 = 2^32.  Are you using a 32-bit OS?
Have you specified a very big http-timeout value in your servers file?

-- 
Philip