Posted to common-user@hadoop.apache.org by Bin YANG <ya...@gmail.com> on 2007/12/05 11:25:35 UTC
Does hadoop just provide reading by line
hi colleague,
Does Hadoop only provide reading files line by line?
How can I read many lines from a file at once?
thanks
--
Bin YANG
Department of Computer Science and Engineering
Fudan University
Shanghai, P. R. China
EMail: yangbinisme82@gmail.com
Re: Does hadoop just provide reading by line
Posted by Fabrice Medio <fm...@discoverymining.com>.
Are you talking about reading arbitrary files from HDFS?
You can just get a regular InputStream to a Path:
import java.io.BufferedReader;
import java.io.InputStreamReader;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;

JobConf conf = new JobConf(SomeJob.class);
FileSystem hdfs = FileSystem.get(conf);
FSDataInputStream inputStream = hdfs.open(new Path("/my/path"));
BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
String s = reader.readLine();
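readLine() gives you one line per call, so reading many lines is just a matter of looping until it returns null. A minimal, standalone sketch of that loop (it uses a StringReader here so it runs without a cluster; against HDFS you would pass new InputStreamReader(inputStream) instead):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class ReadLines {
    // Collect every line from a Reader; with HDFS the Reader would wrap
    // the FSDataInputStream returned by FileSystem.open().
    static List<String> readAllLines(Reader in) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        List<String> lines = new ArrayList<String>();
        String line;
        // readLine() returns null at end of stream
        while ((line = reader.readLine()) != null) {
            lines.add(line);
        }
        reader.close();
        return lines;
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = readAllLines(new StringReader("a\nb\nc"));
        System.out.println(lines.size()); // 3
    }
}
```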
Fabrice
Bin YANG wrote:
> hi colleague,
>
> Does Hadoop only provide reading files line by line?
> How can I read many lines from a file at once?
>
> thanks
>
Re: Does hadoop just provide reading by line
Posted by André Martin <ma...@andremartin.de>.
Hi Bin YANG,
it can read as many lines/bytes as you want, but you need to implement
your own RecordReader
(http://lucene.apache.org/hadoop/api/org/apache/hadoop/mapred/RecordReader.html)
for this (and an InputFormat to make it usable from your jobs...)
Just take a look at the LineRecordReader as a starting point for how to
implement a RecordReader.
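The heart of such a reader is just a loop that folds several lines into one record. A standalone sketch of that grouping logic (plain Java so it runs on its own; linesPerRecord is a hypothetical knob, and a real RecordReader would additionally implement createKey/createValue and track byte offsets for getPos()/getProgress()):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class MultiLineReader {
    private final BufferedReader reader;
    private final int linesPerRecord; // hypothetical: how many lines per record

    MultiLineReader(BufferedReader reader, int linesPerRecord) {
        this.reader = reader;
        this.linesPerRecord = linesPerRecord;
    }

    // Returns the next record (up to linesPerRecord lines joined by '\n'),
    // or null at end of input -- the same contract a RecordReader's next()
    // would fulfil against its input split.
    String nextRecord() throws IOException {
        StringBuilder sb = new StringBuilder();
        int n = 0;
        String line;
        while (n < linesPerRecord && (line = reader.readLine()) != null) {
            if (n > 0) sb.append('\n');
            sb.append(line);
            n++;
        }
        return n == 0 ? null : sb.toString();
    }

    public static void main(String[] args) throws IOException {
        MultiLineReader r = new MultiLineReader(
                new BufferedReader(new StringReader("a\nb\nc\nd\ne")), 2);
        System.out.println(r.nextRecord()); // first record: "a\nb"
    }
}
```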
Cu on the 'net,
Bye - bye,
<<<<< André <<<< >>>> èrbnA >>>>>
...you wrote:
> hi colleague,
>
> Does Hadoop only provide reading files line by line?
> How can I read many lines from a file at once?
>
> thanks