Posted to common-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2018/10/23 12:40:00 UTC

[jira] [Commented] (HADOOP-15871) Some input streams does not obey "java.io.InputStream.available" contract

    [ https://issues.apache.org/jira/browse/HADOOP-15871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16660580#comment-16660580 ] 

Steve Loughran commented on HADOOP-15871:
-----------------------------------------

Extensions to {{hadoop-common-project/hadoop-common/src/site/markdown/filesystem/fsdatainputstream.md}} are welcome, along with contract tests; they could be bundled with any HADOOP-15870 changes.

Workflow:
# work out what is meant to happen in the Java APIs
# look at HDFS to see what it thinks should happen
# write the spec
# write contract tests
# test object stores & patch individually

Looking at java.io, {{available()}} is specified in terms of reads "which won't block". For S3A that would actually be the amount of data remaining in the current read, so just forward to {{wrappedStream.available()}} if {{wrappedStream != null}}, else return 0. But {{com.amazonaws.services.s3.model.S3ObjectInputStream}} calls out always returning 1 here so as not to break {{GZIPInputStream}} (see [https://bugs.java.com/bugdatabase/view_bug.do?bug_id=7036144]).
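
A minimal sketch of that forwarding rule, for discussion only. {{ForwardingAvailableStream}} is a hypothetical stand-in, not the real {{S3AInputStream}}; the only name taken from the comment above is {{wrappedStream}}, and treating a closed stream as "no wrapped stream" is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical stand-in for an object-store input stream; not S3AInputStream itself. */
class ForwardingAvailableStream extends InputStream {
    // Stream for the current read; assumed to be null when nothing is open.
    private InputStream wrappedStream;

    ForwardingAvailableStream(InputStream wrappedStream) {
        this.wrappedStream = wrappedStream;
    }

    @Override
    public int read() throws IOException {
        // With no wrapped stream there is nothing to read, so report end of stream.
        return wrappedStream == null ? -1 : wrappedStream.read();
    }

    @Override
    public int available() throws IOException {
        // Forward to the wrapped stream if there is one, else 0.
        return wrappedStream == null ? 0 : wrappedStream.available();
    }

    @Override
    public void close() throws IOException {
        if (wrappedStream != null) {
            wrappedStream.close();
            wrappedStream = null; // assumption: closed means no wrapped stream
        }
    }
}
```

Wrapping a {{ByteArrayInputStream}} is enough to exercise this locally; whether the wrapped stream's own {{available()}} is a useful estimate is a separate question.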

Was this gzip related? If so, it's something to consider including in a test too.
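
A possible starting point for such a test, as a sketch only: it round-trips a single-member gzip payload through a stream whose {{available()}} is pinned to a chosen value. {{GzipAvailableCheck}} and its helpers are hypothetical names, and the sketch deliberately does not cover multi-member (concatenated) gzip data, where the outcome may depend on the JDK in use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper for probing GZIPInputStream against different available() values. */
class GzipAvailableCheck {

    /** Returns an in-memory stream whose available() always reports the given value. */
    static InputStream withFixedAvailable(byte[] data, final int reported) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int available() {
                return reported;
            }
        };
    }

    /** Compresses the text into a single gzip member. */
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bos)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bos.toByteArray();
    }

    /** Decompresses everything GZIPInputStream is willing to return from the raw stream. */
    static String gunzip(InputStream raw) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(raw)) {
            byte[] buf = new byte[256];
            int n;
            while ((n = in.read(buf)) != -1) {
                bos.write(buf, 0, n);
            }
        }
        return new String(bos.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

A contract test could call {{gunzip(withFixedAvailable(gzip(text), 0))}} and the same with 1, then compare both results with the original text.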


> Some input streams does not obey "java.io.InputStream.available" contract 
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-15871
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15871
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs, fs/s3
>            Reporter: Shixiong Zhu
>            Priority: Major
>
> E.g., DFSInputStream and S3AInputStream return the size of the remaining available bytes, but the javadoc of "available" says it "Returns an estimate of the number of bytes that can be read (or skipped over) from this input stream *without blocking* by the next invocation of a method for this input stream."
> I understand that some applications may rely on the current behavior. It would be great if there were an interface to document how "available" should be implemented.



