Posted to dev@avro.apache.org by "Harsh J (JIRA)" <ji...@apache.org> on 2013/02/07 17:53:12 UTC
[jira] [Created] (AVRO-1244) Provide a SeekableInput implementation
for FileSystem retrieved output streams
Harsh J created AVRO-1244:
-----------------------------
Summary: Provide a SeekableInput implementation for FileSystem retrieved output streams
Key: AVRO-1244
URL: https://issues.apache.org/jira/browse/AVRO-1244
Project: Avro
Issue Type: Improvement
Components: java
Reporter: Harsh J
Priority: Minor
To use the DataFileWriter#appendTo API, one needs to pass an object that implements the SeekableInput interface. Avro provides a ready-made implementation (SeekableFileInput) for files that can be represented by a java.io.File object, but in Hadoop, files on HDFS and other FileSystem implementations cannot be represented by a File object, so users have to implement the interface themselves.
We can add a simple HadoopSeekableFSInput (or similarly named class) that takes Hadoop-provided objects and wraps them in a SeekableInput ready to pass to Avro.
I propose something like the following:
{code}
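import java.io.IOException;
import org.apache.avro.file.SeekableInput;
import org.apache.hadoop.fs.FSDataInputStream;

/** Adapts a Hadoop FSDataInputStream to Avro's SeekableInput. */
public class HadoopSeekableFSInput implements SeekableInput {
  private final FSDataInputStream in;
  private final long length;

  // The stream does not expose the file length, so the caller supplies it.
  public HadoopSeekableFSInput(FSDataInputStream in, long length) {
    this.in = in;
    this.length = length;
  }

  @Override
  public void close() throws IOException {
    in.close();
  }

  @Override
  public void seek(long p) throws IOException {
    in.seek(p);
  }

  @Override
  public long tell() throws IOException {
    return in.getPos();
  }

  @Override
  public long length() throws IOException {
    return length;
  }

  @Override
  public int read(byte[] b, int off, int len) throws IOException {
    return in.read(b, off, len);
  }
}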
{code}
The above can be constructed by users via a simple call such as {{new HadoopSeekableFSInput(fs.open(filePath), fs.getFileStatus(filePath).getLen())}}.
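To illustrate the contract such a wrapper has to satisfy, here is a minimal sketch that runs without Avro or Hadoop on the classpath. Everything in it is a stand-in invented for illustration: SeekableInputLike mirrors the five operations of Avro's SeekableInput, ByteArraySeekableInput plays the role of the wrapped stream, and tail() is a hypothetical caller that repositions the input before reading, which is presumably the kind of access an append needs in order to inspect an existing file.

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in mirroring the operations of Avro's SeekableInput,
// so that this sketch compiles without Avro on the classpath.
interface SeekableInputLike extends Closeable {
    void seek(long p) throws IOException;
    long tell() throws IOException;
    long length() throws IOException;
    int read(byte[] b, int off, int len) throws IOException;
}

// Hypothetical in-memory implementation; the byte array plays the role of the file.
final class ByteArraySeekableInput implements SeekableInputLike {
    private final byte[] data;
    private int pos;
    private boolean closed;

    ByteArraySeekableInput(byte[] data) {
        this.data = data.clone();
    }

    @Override
    public void seek(long p) throws IOException {
        ensureOpen();
        if (p < 0 || p > data.length) {
            throw new IOException("seek out of range: " + p);
        }
        pos = (int) p;
    }

    @Override
    public long tell() throws IOException {
        ensureOpen();
        return pos;
    }

    @Override
    public long length() throws IOException {
        ensureOpen();
        return data.length;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        ensureOpen();
        if (len == 0) {
            return 0;
        }
        if (pos >= data.length) {
            return -1; // end of input, same convention as InputStream.read
        }
        int n = Math.min(len, data.length - pos);
        System.arraycopy(data, pos, b, off, n);
        pos += n;
        return n;
    }

    @Override
    public void close() {
        closed = true;
    }

    private void ensureOpen() throws IOException {
        if (closed) {
            throw new IOException("input is closed");
        }
    }
}

final class SeekableInputSketch {
    // Reads the last n bytes by seeking relative to length(), then reading to the end.
    static String tail(SeekableInputLike in, int n) throws IOException {
        long start = Math.max(0, in.length() - n);
        in.seek(start);
        byte[] buf = new byte[(int) (in.length() - start)];
        int read = 0;
        while (read < buf.length) {
            int r = in.read(buf, read, buf.length - read);
            if (r < 0) {
                break;
            }
            read += r;
        }
        return new String(buf, 0, read, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        byte[] file = "header|records|sync".getBytes(StandardCharsets.US_ASCII);
        try (SeekableInputLike in = new ByteArraySeekableInput(file)) {
            System.out.println(tail(in, 4)); // prints "sync"
            System.out.println(in.tell());   // prints 19
        }
    }
}
```

The sketch also shows why the proposed constructor takes the length as a separate argument: a caller like tail() relies on length() to compute seek targets, and a stream that only knows its current position has to be told the file size up front.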
Ideally this class would live in the Avro core module, but that module does not depend on Hadoop Common today, so another module may be a more suitable home.
This lets users write Avro-append code such as https://gist.github.com/QwertyManiac/4724582 more easily.