Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/01 03:28:35 UTC

[GitHub] [arrow] cyb70289 commented on pull request #12442: ARROW-15706: [C++][FlightRPC] Implement a UCX transport

cyb70289 commented on pull request #12442:
URL: https://github.com/apache/arrow/pull/12442#issuecomment-1085377998


   The "unknown address 0" UCX error looks related to the RDMA network devices plugged into my test machine.
   I spawned a clean VM for testing, and the error does not occur there.
   
   I set a breakpoint where the error is printed: https://github.com/openucx/ucx/blob/v1.12.0/src/ucs/sys/sock.c#L660
   Interestingly, when the breakpoint fires, printing `addr->sa_family` gives 2 (AF_INET), which should be logically impossible.
   It looks like `addr` points to volatile memory that is being changed in parallel by other threads or by hardware.
   
   `addr` is obtained by calling `rdma_get_local_addr` at https://github.com/openucx/ucx/blob/v1.12.0/src/uct/ib/rdmacm/rdmacm_cm_ep.c#L176
   
   From man page: https://linux.die.net/man/3/rdma_get_local_addr
   `rdma_get_local_addr` returns all zeros if the RDMA NIC is not bound to an address. I do have some RDMA NICs disabled, so the error looks harmless, though that doesn't explain the strange behaviour seen in the debugger.
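   
   As a rough illustration of the all-zero case, here is a minimal sketch that does not call into librdmacm or UCX at all. It assumes a Linux host and that the "0" in the error refers to the address family of a zeroed sockaddr. The helpers `sockaddr_is_all_zero` and `describe_family` are hypothetical names made up for this example, not UCX or librdmacm functions.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: returns 1 if every byte of the storage is zero,
 * which is what an unbound local address is assumed to look like. */
int sockaddr_is_all_zero(const struct sockaddr_storage *ss)
{
    const unsigned char *p = (const unsigned char *)ss;
    size_t i;

    for (i = 0; i < sizeof(*ss); i++) {
        if (p[i] != 0) {
            return 0;
        }
    }
    return 1;
}

/* Hypothetical helper: names the address family stored in a sockaddr.
 * A zeroed sockaddr reports AF_UNSPEC, whose value is 0 on Linux. */
const char *describe_family(const struct sockaddr *addr)
{
    switch (addr->sa_family) {
    case AF_UNSPEC:
        return "AF_UNSPEC";
    case AF_INET:
        return "AF_INET";
    case AF_INET6:
        return "AF_INET6";
    default:
        return "other";
    }
}
```

   With a zeroed `struct sockaddr_storage`, `describe_family` reports `AF_UNSPEC` rather than `AF_INET`, so a family of 2 at the breakpoint would not be consistent with an all-zero address read at the time the error was raised.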


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org