Posted to user@hadoop.apache.org by Mohit Anchlia <mo...@gmail.com> on 2012/11/13 00:16:12 UTC

Reading from sequence file using java FS api

I am looking for an example that reads a Snappy-compressed sequence file.
Could someone point me to it? What I have so far is this:


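import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// uri is a String holding the location of the sequence file
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(URI.create(uri), conf);
Path path = new Path(uri);
SequenceFile.Reader reader = null;
LongWritable key = new LongWritable();
Text value = new Text();
try {
    reader = new SequenceFile.Reader(fs, path, conf);
    // Assumed continuation of the truncated snippet: print each record
    while (reader.next(key, value)) {
        System.out.println(key + "\t" + value);
    }
} finally {
    // Null-safe close, so this is fine even if the constructor threw
    IOUtils.closeStream(reader);
}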

Re: Reading from sequence file using java FS api

Posted by Harsh J <ha...@cloudera.com>.
Yes, the codec information is stored in the file's header.

The same goes for Avro, where the schema needed for deserialization is
stored in the file as well, so you can read directly into usable
primitive/compound objects without any manual transformation work.
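
To make the point about the sequence file header concrete, here is a minimal
sketch that parses the leading header fields by hand using only the JDK. It
assumes the documented header layout as I understand it for versions 5 and
later: the magic bytes SEQ, a version byte, the key and value class names,
a compression flag, a block-compression flag, and then the codec class name
if compression is on. The class SequenceFileHeaderSketch and its helpers are
hypothetical names, the sample bytes are synthetic rather than taken from a
real file, and only one-byte string lengths are handled. In real code,
SequenceFile.Reader does all of this for you.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class SequenceFileHeaderSketch {

    /** Leading header fields; the metadata and sync marker that follow are not parsed. */
    static final class Header {
        final int version;
        final String keyClass;
        final String valueClass;
        final boolean compressed;
        final boolean blockCompressed;
        final String codecClass; // null when the file is not compressed

        Header(int version, String keyClass, String valueClass,
               boolean compressed, boolean blockCompressed, String codecClass) {
            this.version = version;
            this.keyClass = keyClass;
            this.valueClass = valueClass;
            this.compressed = compressed;
            this.blockCompressed = blockCompressed;
            this.codecClass = codecClass;
        }
    }

    // Reads a length-prefixed UTF-8 string. Simplification: only one-byte
    // lengths (0..127) are handled, which covers typical class names.
    static String readShortString(ByteBuffer in) {
        int len = in.get();
        if (len < 0) {
            throw new IllegalArgumentException("multi-byte length not supported in this sketch");
        }
        byte[] bytes = new byte[len];
        in.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Parses the assumed header layout described in the lead-in.
    static Header parse(byte[] data) {
        ByteBuffer in = ByteBuffer.wrap(data);
        if (in.get() != 'S' || in.get() != 'E' || in.get() != 'Q') {
            throw new IllegalArgumentException("not a sequence file: missing SEQ magic");
        }
        int version = in.get();
        if (version < 5) {
            throw new IllegalArgumentException("this sketch only handles header versions 5 and later");
        }
        String keyClass = readShortString(in);
        String valueClass = readShortString(in);
        boolean compressed = in.get() != 0;
        boolean blockCompressed = in.get() != 0;
        // The codec class name is only present when compression is enabled.
        String codecClass = compressed ? readShortString(in) : null;
        return new Header(version, keyClass, valueClass, compressed, blockCompressed, codecClass);
    }

    static void writeShortString(ByteArrayOutputStream out, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.write(bytes.length); // one-byte length prefix; valid only up to 127
        out.write(bytes, 0, bytes.length);
    }

    // Builds a synthetic header for a record-compressed file that uses Snappy.
    static byte[] sampleHeaderBytes() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write('S');
        out.write('E');
        out.write('Q');
        out.write(6); // header version
        writeShortString(out, "org.apache.hadoop.io.LongWritable");
        writeShortString(out, "org.apache.hadoop.io.Text");
        out.write(1); // compressed
        out.write(0); // not block-compressed
        writeShortString(out, "org.apache.hadoop.io.compress.SnappyCodec");
        return out.toByteArray();
    }

    public static void main(String[] args) {
        Header h = parse(sampleHeaderBytes());
        System.out.println("codec: " + h.codecClass);
    }
}
```

Because the codec class name travels with the file, the reader can look it up
and decompress records without any hint from the caller.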

On Tue, Nov 13, 2012 at 6:07 AM, Mohit Anchlia <mo...@gmail.com> wrote:
> I was simply able to read the file using the code below. I didn't have to
> decompress it myself. It looks like the reader automatically detects the
> compression and decompresses the file before returning it to the user.
>
> On Mon, Nov 12, 2012 at 3:16 PM, Mohit Anchlia <mo...@gmail.com> wrote:
>>
>> I am looking for an example that reads a Snappy-compressed sequence file.
>> Could someone point me to it? What I have so far is this:
>>
>> Configuration conf = new Configuration();
>> FileSystem fs = FileSystem.get(URI.create(uri), conf);
>> Path path = new Path(uri);
>> SequenceFile.Reader reader = null;
>> org.apache.hadoop.io.LongWritable key =
>>     new org.apache.hadoop.io.LongWritable();
>> org.apache.hadoop.io.Text value = new org.apache.hadoop.io.Text();
>> try {
>>     reader = new SequenceFile.Reader(fs, path, conf);



-- 
Harsh J

Re: Reading from sequence file using java FS api

Posted by Mohit Anchlia <mo...@gmail.com>.
I was simply able to read the file using the code below. I didn't have to
decompress it myself. It looks like the reader automatically detects the
compression and decompresses the file before returning it to the user.

On Mon, Nov 12, 2012 at 3:16 PM, Mohit Anchlia <mo...@gmail.com> wrote:

> I am looking for an example that reads a Snappy-compressed sequence file.
> Could someone point me to it? What I have so far is this:
>
> Configuration conf = new Configuration();
> FileSystem fs = FileSystem.get(URI.create(uri), conf);
> Path path = new Path(uri);
> SequenceFile.Reader reader = null;
> org.apache.hadoop.io.LongWritable key =
>     new org.apache.hadoop.io.LongWritable();
> org.apache.hadoop.io.Text value = new org.apache.hadoop.io.Text();
> try {
>     reader = new SequenceFile.Reader(fs, path, conf);
