Posted to common-issues@hadoop.apache.org by "Sean Mackrory (JIRA)" <ji...@apache.org> on 2019/06/25 14:33:00 UTC

[jira] [Comment Edited] (HADOOP-15729) [s3a] stop treat fs.s3a.max.threads as the long-term minimum

    [ https://issues.apache.org/jira/browse/HADOOP-15729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16872396#comment-16872396 ] 

Sean Mackrory edited comment on HADOOP-15729 at 6/25/19 2:32 PM:
-----------------------------------------------------------------

Having slept on it, I think we should go ahead and undeprecate fs.s3a.threads.core (I just saw in the code that we had it once) but have it default to 0. A non-zero default is dangerous: if anyone has set max threads to 1 (and I know some people have set it to low values, albeit not THAT low, to work around the problem I described above) and core threads is suddenly, say, 5, they'll start getting exceptions on upgrade (or we'll have to handle that exception and rather quietly fall back to 0 core threads, which doesn't seem like expected behavior). But the more I think about this, the less averse I am to making it tunable right away, especially since the property has existed before but has been unused for some time.

edit: it's actually only deprecated in branch-2 for compatibility reasons. In trunk, it's gone entirely. This happened at the time of the BlockingThreadPoolExecutor implementation. 
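
To make the upgrade concern concrete: java.util.concurrent.ThreadPoolExecutor rejects a core pool size larger than the maximum pool size with an IllegalArgumentException at construction time. The following is only a minimal sketch of that JDK behavior, not the S3A code; the class name, the helper method and the queue choice are hypothetical and picked for illustration.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of Hadoop or S3A.
class CoreVersusMaxThreads {

    /** Returns true if the JDK accepts this core/max combination. */
    static boolean accepts(int coreThreads, int maxThreads) {
        try {
            ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreThreads, maxThreads, 60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>());
            pool.shutdown();
            return true;
        } catch (IllegalArgumentException e) {
            // Thrown by the constructor when maxThreads < coreThreads.
            return false;
        }
    }

    public static void main(String[] args) {
        // A core default of 5 combined with a user-set maximum of 1.
        System.out.println("core=5, max=1 accepted: " + accepts(5, 1));
        // A core default of 0 combined with the same maximum.
        System.out.println("core=0, max=1 accepted: " + accepts(0, 1));
    }
}
```

With a default of 0 core threads, any positive maximum a user has already configured remains a valid combination, which is the argument for that default.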


> [s3a] stop treat fs.s3a.max.threads as the long-term minimum
> ------------------------------------------------------------
>
>                 Key: HADOOP-15729
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15729
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>            Priority: Major
>         Attachments: HADOOP-15729.001.patch
>
>
> A while ago the s3a connector started experiencing deadlocks because the AWS SDK requires an unbounded threadpool. It places monitoring tasks on the work queue before the tasks they wait on, so it's possible (has even happened with larger-than-default threadpools) for the executor to become permanently saturated and deadlock.
> So we started giving an unbounded threadpool executor to the SDK, and using a bounded, blocking threadpool service for everything else S3A needs (although currently that's only in the S3ABlockOutputStream). fs.s3a.max.threads then only limits this bounded threadpool. However, we also specified fs.s3a.max.threads as the number of core threads in the unbounded threadpool, which in hindsight is pretty terrible.
> Currently those core threads do not time out, so this is actually setting a sort of minimum. Once that many tasks have been submitted, the threadpool stays at that size until it bursts beyond it, and afterwards it will only spin back down to that size. If fs.s3a.max.threads is set reasonably high and someone uses a bunch of S3 buckets, they could easily have thousands of idle threads constantly.
> We should either not use fs.s3a.max.threads for the core pool size and introduce a new configuration, or we should simply allow core threads to time out. I'm reading the OpenJDK source now to see what subtle differences there are between core threads and other threads if core threads can time out.
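
The "sort of minimum" effect described above, and the second proposed fix, can be sketched with a plain JDK ThreadPoolExecutor. This is an illustration under assumptions, not the S3A implementation: the class name, the helper method, the SynchronousQueue and the timing values are all hypothetical. It submits one trivial task per core thread, lets the pool sit idle, and reports how many threads remain, with and without ThreadPoolExecutor.allowCoreThreadTimeOut(true).

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of Hadoop or S3A.
class CoreThreadTimeoutDemo {

    /**
     * Submits one trivial task per core thread, idles for idleMillis,
     * and returns the number of threads still held by the pool.
     */
    static int idlePoolSize(int coreThreads, boolean allowCoreTimeout,
                            long keepAliveMillis, long idleMillis) {
        // Effectively unbounded maximum, as an assumption about the setup.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            coreThreads, Integer.MAX_VALUE,
            keepAliveMillis, TimeUnit.MILLISECONDS,
            new SynchronousQueue<>());
        // When true, idle core threads are also subject to the keep-alive time.
        pool.allowCoreThreadTimeOut(allowCoreTimeout);
        try {
            for (int i = 0; i < coreThreads; i++) {
                // Below the core size, each submission starts a new thread.
                pool.execute(() -> { });
            }
            Thread.sleep(idleMillis);
            return pool.getPoolSize();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("idle threads, core timeout off: "
            + idlePoolSize(4, false, 50, 1000));
        System.out.println("idle threads, core timeout on:  "
            + idlePoolSize(4, true, 50, 1000));
    }
}
```

In this sketch the pool keeps all of its core threads when core timeout is off and releases them when it is on. Note that allowCoreThreadTimeOut(true) requires a keep-alive time greater than zero.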


