Posted to common-issues@hadoop.apache.org by "Jason Lowe (JIRA)" <ji...@apache.org> on 2017/08/21 16:12:00 UTC

[jira] [Commented] (HADOOP-13139) Branch-2: S3a to use thread pool that blocks clients

    [ https://issues.apache.org/jira/browse/HADOOP-13139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16135370#comment-16135370 ] 

Jason Lowe commented on HADOOP-13139:
-------------------------------------

We had a user run into this on one of our clusters after it upgraded to 2.8.  Their job was running a pre-2.8 version of the S3AFileSystem code, and it failed like this:
{noformat}
java.lang.IllegalArgumentException
	at java.util.concurrent.ThreadPoolExecutor.<init>(ThreadPoolExecutor.java:1307)
	at java.util.concurrent.ThreadPoolExecutor.<init>(ThreadPoolExecutor.java:1230)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:280)
	at com.yahoo.prism.UseLocalKeyS3AFileSystem.initializeFileSystem(UseLocalKeyS3AFileSystem.java:68)
	at com.yahoo.prism.UseLocalKeyS3AFileSystem.initialize(UseLocalKeyS3AFileSystem.java:113)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2670)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:95)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2704)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2686)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:374)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
[...]
{noformat}

The problem is that core-default in 2.8 removed fs.s3a.threads.core but changed the existing fs.s3a.threads.max to 10.  The old pre-2.8 S3AFileSystem code had code defaults of 15 and 256, respectively.  So when a 2.8 job client (in this case an Oozie server) submits the job, it picks up the 2.8 core-default setting for fs.s3a.threads.max and writes it into job.xml.  The job itself still runs with the older S3AFileSystem code, which falls back to its own code default for the missing fs.s3a.threads.core.  The job then fails because it tries to initialize a thread pool with core threads=15 and max threads=10, and ThreadPoolExecutor rejects a core pool size larger than the maximum pool size with an IllegalArgumentException.
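
For anyone who wants to reproduce the failure mode outside of Hadoop, here is a minimal sketch using only the JDK.  The class and method names (PoolSizeMismatch, canBuildPool) are made up for illustration, and the keep-alive time and queue type are arbitrary assumptions rather than what S3AFileSystem actually passes; only the two pool sizes come from the description above.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolSizeMismatch {

    /**
     * Hypothetical helper: returns true if a ThreadPoolExecutor can be
     * constructed with the given core and maximum pool sizes.
     */
    static boolean canBuildPool(int coreThreads, int maxThreads) {
        try {
            // Keep-alive and queue are placeholders, not the S3A settings.
            ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreThreads, maxThreads, 60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>());
            pool.shutdown();
            return true;
        } catch (IllegalArgumentException e) {
            // Thrown by the constructor when maxThreads < coreThreads.
            return false;
        }
    }

    public static void main(String[] args) {
        // Old code defaults on their own: core=15, max=256.
        System.out.println("core=15, max=256: " + canBuildPool(15, 256));
        // Old code default for core combined with the 2.8 core-default max.
        System.out.println("core=15, max=10:  " + canBuildPool(15, 10));
    }
}
```

The first combination constructs fine and the second is rejected in the constructor, which matches the top two frames of the stack trace.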

Not sure if this is considered simply an invalid setup, but I suspect this won't be the last case of someone submitting a job with a 2.8 or later client (e.g., via an Oozie server upgraded independently of a user's job code) and failing because the user hasn't upgraded to the 2.8 or later S3AFileSystem code yet.

If we had added a deprecated core-default value for fs.s3a.threads.core then the older code would have gotten consistent values for core and max threads.  As it is now, it gets half of the new default settings, and those aren't compatible with the older, other half of the defaults.  Thoughts on whether this is worth doing in a followup JIRA?
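
To make the "half of the defaults" point concrete, here is a rough sketch of how the old code would resolve the two settings.  It uses java.util.Properties as a stand-in for the job configuration rather than the real Hadoop Configuration class, the names (MixedDefaults, oldCodeSizesAreConsistent) are hypothetical, and the value of 10 for a re-added fs.s3a.threads.core is purely an assumption for illustration, not a proposed default.

```java
import java.util.Properties;

public class MixedDefaults {
    // Code defaults of the pre-2.8 S3AFileSystem, per the comment above.
    static final int OLD_CODE_DEFAULT_CORE = 15;
    static final int OLD_CODE_DEFAULT_MAX = 256;

    /** Reads an int setting, falling back to the code default if unset. */
    static int getInt(Properties conf, String key, int codeDefault) {
        return Integer.parseInt(
            conf.getProperty(key, String.valueOf(codeDefault)));
    }

    /**
     * True if the sizes the old code would resolve satisfy core <= max,
     * which is the constraint violated in this failure.
     */
    static boolean oldCodeSizesAreConsistent(Properties jobXml) {
        int core = getInt(jobXml, "fs.s3a.threads.core", OLD_CODE_DEFAULT_CORE);
        int max = getInt(jobXml, "fs.s3a.threads.max", OLD_CODE_DEFAULT_MAX);
        return core <= max;
    }

    public static void main(String[] args) {
        // job.xml as written by a 2.8 client: only max is present.
        Properties today = new Properties();
        today.setProperty("fs.s3a.threads.max", "10");
        System.out.println("2.8 core-default as-is: "
            + oldCodeSizesAreConsistent(today));

        // Same, plus a hypothetical deprecated core value (10 is assumed).
        Properties withCore = new Properties();
        withCore.setProperty("fs.s3a.threads.max", "10");
        withCore.setProperty("fs.s3a.threads.core", "10");
        System.out.println("with a core-default value for core: "
            + oldCodeSizesAreConsistent(withCore));
    }
}
```

With only fs.s3a.threads.max present the old code ends up with 15 and 10; with both keys supplied from the same core-default it gets a matching pair.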

> Branch-2: S3a to use thread pool that blocks clients
> ----------------------------------------------------
>
>                 Key: HADOOP-13139
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13139
>             Project: Hadoop Common
>          Issue Type: Task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Pieter Reuse
>            Assignee: Pieter Reuse
>             Fix For: 2.8.0
>
>         Attachments: HADOOP-13139-001.patch, HADOOP-13139-branch-2.001.patch, HADOOP-13139-branch-2.002.patch, HADOOP-13139-branch-2-003.patch, HADOOP-13139-branch-2-004.patch, HADOOP-13139-branch-2-005.patch, HADOOP-13139-branch-2-006.patch
>
>
> HADOOP-11684 is accepted into trunk, but was not applied to branch-2. I will attach a patch applicable to branch-2.
> It should be noted in CHANGES-2.8.0.txt that the config parameter 'fs.s3a.threads.core' has been removed and the behavior of the ThreadPool for s3a has been changed.


