Posted to notifications@accumulo.apache.org by "Christopher Tubbs (JIRA)" <ji...@apache.org> on 2019/04/22 19:56:00 UTC

[jira] [Resolved] (ACCUMULO-418) Make RFiles splittable

     [ https://issues.apache.org/jira/browse/ACCUMULO-418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christopher Tubbs resolved ACCUMULO-418.
----------------------------------------
    Resolution: Won't Fix

This issue is quite old. Numerous changes made since it was created may help mitigate it, and external solutions are also possible.

1. There is now an RFile API.
2. A user could create their own InputFormat with InputSplit types that accept a file name and a range.
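
As a rough illustration of point 2, here is a minimal sketch of a split type that carries a file name and a row range. Everything in it is hypothetical: the class name RFileRangeSplit, the splitsFor helper, and the use of plain java.io for serialization are assumptions made so the example stands alone. A real implementation would presumably extend Hadoop's InputSplit, implement Writable, and pair with a RecordReader that reads only the rows in the split's range. String comparison stands in for Accumulo's row ordering.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical split: one file plus a half-open row range [startRow, endRow).
// A null bound means "unbounded" on that side.
final class RFileRangeSplit {
    final String file;
    final String startRow; // inclusive, or null for start of file
    final String endRow;   // exclusive, or null for end of file

    RFileRangeSplit(String file, String startRow, String endRow) {
        this.file = file;
        this.startRow = startRow;
        this.endRow = endRow;
    }

    // Turn a sorted list of boundary rows into contiguous splits over one file.
    // N boundaries produce N + 1 splits; an empty list produces one whole-file split.
    static List<RFileRangeSplit> splitsFor(String file, List<String> sortedBoundaries) {
        List<RFileRangeSplit> splits = new ArrayList<>();
        String previous = null;
        for (String boundary : sortedBoundaries) {
            splits.add(new RFileRangeSplit(file, previous, boundary));
            previous = boundary;
        }
        splits.add(new RFileRangeSplit(file, previous, null));
        return splits;
    }

    // True if the row falls inside this split's range.
    boolean contains(String row) {
        return (startRow == null || row.compareTo(startRow) >= 0)
            && (endRow == null || row.compareTo(endRow) < 0);
    }

    // Serialize the split so it could be shipped to a task (stand-in for Writable.write).
    byte[] toBytes() {
        try (ByteArrayOutputStream buffer = new ByteArrayOutputStream();
             DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(file);
            writeNullable(out, startRow);
            writeNullable(out, endRow);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Rebuild a split from its serialized form (stand-in for Writable.readFields).
    static RFileRangeSplit fromBytes(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            String file = in.readUTF();
            String start = readNullable(in);
            String end = readNullable(in);
            return new RFileRangeSplit(file, start, end);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void writeNullable(DataOutputStream out, String value) throws IOException {
        out.writeBoolean(value != null);
        if (value != null) {
            out.writeUTF(value);
        }
    }

    private static String readNullable(DataInputStream in) throws IOException {
        return in.readBoolean() ? in.readUTF() : null;
    }
}
```

With this shape, a custom InputFormat could return several such splits per file, and each task would scan only its own range. How the boundary rows are chosen (for example, from the file's index) is left open here.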

In any case, if somebody still wishes to pursue this, please open an issue on GitHub, where we now track issues: https://github.com/apache/accumulo/issues

> Make RFiles splittable
> ----------------------
>
>                 Key: ACCUMULO-418
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-418
>             Project: Accumulo
>          Issue Type: New Feature
>          Components: master, tserver
>         Environment: All
>            Reporter: Ivan Bella
>            Priority: Major
>              Labels: RFile, hadoop, mapreduce
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> There are times when iterating over RFiles is useful in map-reduce jobs. I know that RFiles can logically be split on block boundaries; however, there is currently no easy way to do this, as no RFile RecordReader or InputFormat is provided.


