Posted to common-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2017/08/14 10:47:01 UTC

[jira] [Commented] (HADOOP-14770) S3A http connection in s3a driver not reuse in Spark application

    [ https://issues.apache.org/jira/browse/HADOOP-14770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16125524#comment-16125524 ] 

Steve Loughran commented on HADOOP-14770:
-----------------------------------------

# Please add the Hadoop version to the JIRA, thanks.
# What is the file format: simple, or columnar (ORC, Parquet)?
# It looks like the connection is being closed on every seek. That is a sign either of HADOOP-13203 (random IO) not engaging or, on a sequential read, of forward reads aborting/reopening rather than skipping forward.
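
To illustrate the last point, here is a much-simplified model of the choice a stream faces on a seek. It is a sketch only, not the actual S3A input stream code: the function name {{seek_action}} and its parameters are hypothetical, and the real stream also takes into account bytes already buffered and the bounds of the current GET request.

```python
def seek_action(pos, target, readahead_range):
    """Return what a simplified stream would do to move from pos to target.

    Hypothetical model: a short forward seek is served by reading and
    discarding bytes on the already-open HTTP response; anything else
    closes the stream and issues a new GET at the target offset.
    """
    diff = target - pos
    if diff == 0:
        return "noop"
    if 0 < diff <= readahead_range:
        return "skip"    # stay on the open connection
    return "reopen"      # backward seek or long forward seek


KB = 1024
# Illustrative numbers: a 100 KB forward seek with two readahead ranges.
print(seek_action(0, 100 * KB, 64 * KB))    # reopen
print(seek_action(0, 100 * KB, 768 * KB))   # skip
```

In this model, every "reopen" is a candidate for the kind of connection churn reported in the issue.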

Make sure you are using the Hadoop 2.8.x JARs, then:

For columnar data, enable random IO:

{code}
spark.hadoop.fs.s3a.experimental.fadvise=random
{code}

For sequential data with big forward skips, increase the readahead range:

{code}
spark.hadoop.fs.s3a.readahead.range = 768K
{code}
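
If it is easier to pass these on the command line than in a properties file, the two settings above could be rendered as {{spark-submit --conf}} arguments. The following is a minimal sketch; the helper names {{spark_hadoop_conf}} and {{as_submit_args}} are hypothetical, and the only behaviour relied on is that Spark forwards {{spark.hadoop.*}} properties to the Hadoop configuration.

```python
def spark_hadoop_conf(hadoop_options):
    """Prefix Hadoop option names with 'spark.hadoop.' for use in Spark."""
    return {"spark.hadoop." + key: value for key, value in hadoop_options.items()}


def as_submit_args(conf):
    """Render a configuration dict as spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


# Option values are the ones suggested above; choose by data format.
columnar = spark_hadoop_conf({"fs.s3a.experimental.fadvise": "random"})
sequential = spark_hadoop_conf({"fs.s3a.readahead.range": "768K"})

print(" ".join(as_submit_args(columnar)))
print(" ".join(as_submit_args(sequential)))
```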

If this fixes it, close the issue as a duplicate of HADOOP-13203.
If it doesn't, print both the input stream and the s3a FS: their toString() methods print all their statistics.

Oh, one more possible cause: split calculation isn't getting it right. Look at your s3a block size, and at the file format itself.
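
As a rough illustration of why block size matters, the sketch below models block-size-based splitting in the style of Hadoop's FileInputFormat, as far as I recall it. The function name {{split_lengths}} and the file and block sizes are assumptions for illustration, and Spark's own readers may compute splits differently.

```python
SPLIT_SLOP = 1.1  # the last split may be up to 10% larger than split_size


def split_lengths(file_length, block_size, min_size=1, max_size=2**63 - 1):
    """Return the byte length of each split for a single file."""
    split_size = max(min_size, min(max_size, block_size))
    lengths = []
    remaining = file_length
    # Carve off full-size splits while clearly more than one split remains.
    while remaining / split_size > SPLIT_SLOP:
        lengths.append(split_size)
        remaining -= split_size
    if remaining:
        lengths.append(remaining)
    return lengths


MB = 1024 * 1024
# Hypothetical 100 MB file with two different block sizes.
print(len(split_lengths(100 * MB, 32 * MB)))  # 4
print(len(split_lengths(100 * MB, 1 * MB)))   # 100
```

A small block size means many more splits over the same file, and each split is read by a task that has to open the object and seek to its own offset.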



> S3A http connection in s3a driver not reuse in Spark application
> ----------------------------------------------------------------
>
>                 Key: HADOOP-14770
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14770
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Yonger
>            Assignee: Yonger
>
> I print out connection stats every 2 s when running a Spark application against S3-compatible storage:
> ESTAB      0      0         ::ffff:10.0.2.36:44446                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44454                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44374                ::ffff:10.0.2.254:80                 
> ESTAB      159724 0         ::ffff:10.0.2.36:44436                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44448                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44338                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44438                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44414                ::ffff:10.0.2.254:80                 
> ESTAB      0      480       ::ffff:10.0.2.36:44450                ::ffff:10.0.2.254:80                  timer:(on,170ms,0)
> ESTAB      0      0         ::ffff:10.0.2.36:44442                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44390                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44326                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44452                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44394                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44444                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44456                ::ffff:10.0.2.254:80                 
> ======================
> ESTAB      0      0         ::ffff:10.0.2.36:44508                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44476                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44524                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44374                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44500                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44504                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44512                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44506                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44464                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44518                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44510                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44442                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44526                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44472                ::ffff:10.0.2.254:80                 
> ESTAB      0      0         ::ffff:10.0.2.36:44466                ::ffff:10.0.2.254:80 
> The connections listed above and below the "=" line changed all the time. But this hasn't been seen in an MR application.


