Posted to user@accumulo.apache.org by Bulldog20630405 <bu...@gmail.com> on 2020/01/10 22:21:15 UTC

parsing rfile (e.g. AccumuloFileInputFormat)

accumulo has an AccumuloFileOutputFormat which i use all the time; however,
there is no AccumuloFileInputFormat.  does anyone have an example? or
at least a description of the rfile structure so i can parse it one key at a time?

Re: parsing rfile (e.g. AccumuloFileInputFormat)

Posted by Bulldog20630405 <bu...@gmail.com>.
thanks; this will be a great help!

On Fri, Jan 10, 2020 at 7:28 PM Christopher <ct...@apache.org> wrote:

> The code that backs the `bin/accumulo rfile-info
> hdfs://path/to/rfile.rf` command is located at
>
> https://github.com/apache/accumulo/blob/3fd5cad92f9b63ac19e4466f3f2d5237b905262c/core/src/main/java/org/apache/accumulo/core/file/rfile/PrintInfo.java
>
> It may be a useful example of how to read key/value pairs from an
> RFile to implement such a thing.
>
> However, that code uses a lot of internal APIs. We have a public API
> (since 1.8) for reading from RFiles that is probably better to use:
>
> https://accumulo.apache.org/docs/2.x/apidocs/org/apache/accumulo/core/client/rfile/RFile.html#newScanner()

Re: parsing rfile (e.g. AccumuloFileInputFormat)

Posted by Christopher <ct...@apache.org>.
The code that backs the `bin/accumulo rfile-info
hdfs://path/to/rfile.rf` command is located at
https://github.com/apache/accumulo/blob/3fd5cad92f9b63ac19e4466f3f2d5237b905262c/core/src/main/java/org/apache/accumulo/core/file/rfile/PrintInfo.java

It may be a useful example of how to read key/value pairs from an
RFile if you want to implement an input format yourself.

However, that code uses a lot of internal APIs. We have a public API
(since 1.8) for reading from RFiles that is probably better to use:
https://accumulo.apache.org/docs/2.x/apidocs/org/apache/accumulo/core/client/rfile/RFile.html#newScanner()
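
For illustration, a minimal sketch of reading an RFile one key at a time
through that public API might look like the following. The class name
`RFileDump` and the method `dump` are hypothetical, and the sketch assumes
accumulo-core (1.8 or later) and the Hadoop client libraries are on the
classpath. By default the scanner built this way uses empty authorizations
and applies the system iterators, so adjust the builder options if you need
to see every raw entry.

```java
import java.io.IOException;
import java.util.Map;

import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.client.rfile.RFile;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper class; not part of Accumulo.
public class RFileDump {

  // Prints every visible entry in the RFile and returns how many were read.
  static long dump(String rfilePath, FileSystem fs) {
    // Build a Scanner directly over the file; no running Accumulo instance is needed.
    Scanner scanner = RFile.newScanner().from(rfilePath).withFileSystem(fs).build();
    long count = 0;
    try {
      for (Map.Entry<Key,Value> entry : scanner) {
        System.out.println(entry.getKey() + " -> " + entry.getValue());
        count++;
      }
    } finally {
      // Release the underlying file handles.
      scanner.close();
    }
    return count;
  }

  public static void main(String[] args) throws IOException {
    // Path to the RFile, e.g. hdfs://path/to/rfile.rf
    String rfilePath = args[0];
    Configuration conf = new Configuration();
    // Resolve the file system (HDFS, local, ...) from the path's scheme.
    FileSystem fs = new Path(rfilePath).getFileSystem(conf);
    System.out.println(dump(rfilePath, fs) + " entries");
  }
}
```

The same loop could sit inside a custom Hadoop RecordReader if an input
format is what you are after, though that wiring is not shown here.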
