Posted to dev@hbase.apache.org by Nick Dimiduk <nd...@gmail.com> on 2014/07/23 20:56:58 UTC

HFileLink backreferences

Heya,

I see that we maintain back-references for HFileLinks. These appear to be
used by HFileLinkCleaner to determine when an HFile has no HFileLinks and
thus whether it can be deleted without orphaning those links.

This is problematic for the MapReduce-over-snapshot-files feature.
Restoring a snapshot can create HFileLinks in the restore directory that
point to existing files. Those links then create back-references under the
root path. Thus we have a situation where the user running the MR job
requires write access to the HBase root path.
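
As a rough illustration of the mechanism described above, here is a toy
in-memory model in Java. It is not HBase code: the class, the method names,
the users and the paths are all made up for this sketch. It only shows the
two behaviours in question: creating a link also records a back-reference
beside the target file, so it fails for a user who cannot write under the
root path, and a cleaner-style check refuses to delete a file that still
has back-references.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: names, users and paths are hypothetical, not HBase APIs.
class BackRefSketch {
    // Target file path -> paths of the links that point at it.
    private final Map<String, Set<String>> backRefs = new HashMap<>();
    // Users allowed to write under the (modelled) root path.
    private final Set<String> rootWriters = new HashSet<>();

    BackRefSketch(String... writers) {
        for (String w : writers) {
            rootWriters.add(w);
        }
    }

    // Creating a link records a back-reference next to the target file,
    // so the caller needs write access under the root path.
    void createLink(String user, String linkPath, String targetPath) {
        if (!rootWriters.contains(user)) {
            throw new SecurityException(user + " cannot write under the root path");
        }
        backRefs.computeIfAbsent(targetPath, k -> new HashSet<>()).add(linkPath);
    }

    // Removing a link drops its back-reference.
    void removeLink(String linkPath, String targetPath) {
        Set<String> refs = backRefs.get(targetPath);
        if (refs != null) {
            refs.remove(linkPath);
        }
    }

    // Cleaner-style check: a file is deletable only if nothing links to it.
    boolean isDeletable(String targetPath) {
        Set<String> refs = backRefs.get(targetPath);
        return refs == null || refs.isEmpty();
    }

    public static void main(String[] args) {
        BackRefSketch fs = new BackRefSketch("hbase");
        fs.createLink("hbase", "/restore/t1/f1", "/hbase/archive/f1");
        System.out.println(fs.isDeletable("/hbase/archive/f1")); // false
        try {
            // A non-hbase user restoring a snapshot hits the write requirement.
            fs.createLink("analyst", "/restore/t2/f1", "/hbase/archive/f1");
        } catch (SecurityException e) {
            System.out.println("denied: " + e.getMessage());
        }
    }
}
```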

Was this already discussed in the original ticket (HBASE-8369)? We mention
the requirement of read permissions (and bypassing security) in the release
note, but I didn't see any comments about write access. Requiring write
permission effectively means you can only run MR as the hbase user, which is
pretty much a non-starter for any interesting integration of this feature.

Thoughts?

Thanks,
Nick

Re: HFileLink backreferences

Posted by Enis Söztutar <en...@gmail.com>.
It might be possible to make HFileLink skip creating the back-references
in this case, so that only read access is required.

Exporting the snapshot into a different directory first, then restoring it
and running the MR job on top of that, is another solution.


On Wed, Jul 23, 2014 at 12:44 PM, D vd Reddy <dv...@gmail.com>
wrote:

> Hi,
>
> This is the same issue we were seeing when we wanted to use this. It was
> also a difficulty when we wanted to use this for HBase-Hive integration
> over snapshots (see HBASE-11484). I think what Matteo Bertozzi
> <https://issues.apache.org/jira/secure/ViewProfile.jspa?name=mbertozzi>
> suggested in the JIRA HBASE-11484 can be a nice solution for this, where
> we provide a way to run the MR without restoring the snapshot.

Re: HFileLink backreferences

Posted by D vd Reddy <dv...@gmail.com>.
Hi,

This is the same issue we were seeing when we wanted to use this. It was
also a difficulty when we wanted to use this for HBase-Hive integration
over snapshots (see HBASE-11484). I think what Matteo Bertozzi
<https://issues.apache.org/jira/secure/ViewProfile.jspa?name=mbertozzi>
suggested in the JIRA HBASE-11484 can be a nice solution for this, where
we provide a way to run the MR without restoring the snapshot.

