Posted to notifications@accumulo.apache.org by "ctubbsii (via GitHub)" <gi...@apache.org> on 2023/09/28 08:29:17 UTC

[GitHub] [accumulo] ctubbsii commented on issue #3762: Broken or Flaky test: ManagerRepairsDualAssignmentIT

ctubbsii commented on issue #3762:
URL: https://github.com/apache/accumulo/issues/3762#issuecomment-1738705809

   Upon further investigation, it looks like this was caused specifically by the mitigation for Thrift's max message size issue, where the resulting errors would otherwise be retried indefinitely. That mitigation was added in b1b2557f949e9212a1b1ca9b65f2d66c01a69edb for #3737 to address #3731.
   
   However, in this case, we actually do want to retry, because it's a transient failure, not a fatal one.
   
   The problem is that we cannot distinguish between the transient failures of the socket being closed because a server died, and the non-transient failures of hitting the Thrift max message size.
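
To illustrate the ambiguity, here is a minimal sketch. It assumes, as the paragraph above implies, that both failure modes reach the client as an EOFException somewhere in the cause chain. The class and method names are hypothetical and are not Accumulo's actual code; the two exceptions are hand-built stand-ins rather than real Thrift errors.

```java
import java.io.EOFException;
import java.io.IOException;

// Hypothetical helper, not Accumulo code: shows why a cause-chain check for
// EOFException cannot separate the two failure modes discussed above.
public class EofClassifier {

  /** Returns true if any exception in the cause chain is an EOFException. */
  public static boolean hasEofCause(Throwable t) {
    for (Throwable c = t; c != null; c = c.getCause()) {
      if (c instanceof EOFException) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Stand-in for a server dying mid-read: transient, should be retried.
    Exception serverDied = new IOException("read failed", new EOFException());
    // Stand-in for the connection dropping after an oversized message:
    // non-transient, retrying will not help (assumed to look the same).
    Exception messageTooLarge = new IOException("read failed", new EOFException());

    System.out.println(hasEofCause(serverDied));      // prints true
    System.out.println(hasEofCause(messageTooLarge)); // prints true
  }
}
```

Under that assumption the check returns the same answer for both cases, so any rule keyed only on EOFException has to treat them the same way: either both are retried or both are fatal.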
   
   Accumulo is intended to be robust against transient network outages, and the Thrift max message size problem can be mitigated with configuration that increases the max message size. Given that, I'm thinking we're going to need to revert the categorization of EOFException as a fatal error that was done in b1b2557f949e9212a1b1ca9b65f2d66c01a69edb, because it's better not to fail to resume a scan just because of a transient network failure. @dlmarion, what do you think?
   
   Also, related: I'd like @keith-turner's opinion on whether we should even be using the framed transport, which creates all these headaches for us in the first place. I'm also curious whether @dlmarion or @keith-turner know if the framed transport should be used for client sockets at all. The javadoc on TTransportFactory reads as if the wrapping is really only expected to be used in servers. However, we use it to wrap client transports in our ThriftUtil class, and I'm not sure that's the right thing to do. I'm not even sure why we're using the framed transport at all.

