Posted to user@hbase.apache.org by Patrick Schless <pa...@gmail.com> on 2013/06/17 21:20:31 UTC

CopyTable

Context:
I'm working on getting replication set up, and a prerequisite for me is to
rename the table (since you have to replicate to the same name as the
source). For this, I'm testing a CopyTable strategy, since there doesn't
seem to be a good way to rename a table (please correct me if I'm wrong).

My question:
According to [1], the CopyTable job takes an argument "all.cells" which "Also
copy delete markers and uncollected deleted cells (advanced option)."

I'm confused by the "advanced option" bit. When would you not want to copy
deletes over to a new table? Without that, it seems like you could end up
with more data than you were expecting in the target table.
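
A toy model may help frame the question (plain Python, not the HBase API). It assumes that a normal scan hides any put masked by a later delete marker on the same row, while a raw scan returns everything. Under that assumption a one-shot full copy without markers exposes the same live data as one with markers. The `Cell` tuple and the `raw_scan` / `visible` helpers are made up for illustration; real HBase delete markers (per version, column, or family) are richer than this.

```python
from collections import namedtuple

# Hypothetical, simplified cell: kind is "put" or "delete".
Cell = namedtuple("Cell", "row ts kind value")

def raw_scan(cells):
    """Everything stored, delete markers included."""
    return sorted(cells, key=lambda c: (c.row, c.ts))

def visible(cells):
    """Puts not masked by a delete marker on the same row with ts >= the put's ts."""
    deletes = [c for c in cells if c.kind == "delete"]
    return [
        c for c in raw_scan(cells)
        if c.kind == "put"
        and not any(d.row == c.row and d.ts >= c.ts for d in deletes)
    ]

source = [
    Cell("r1", 1, "put", "a"),
    Cell("r1", 5, "delete", None),   # masks the put at ts=1
    Cell("r2", 2, "put", "b"),
]

copy_without_markers = visible(source)   # models a default copy
copy_with_markers = raw_scan(source)     # models a copy that keeps markers

# Both copies expose the same live rows; the second just carries extra cells.
print([c.row for c in visible(copy_without_markers)])
print([c.row for c in visible(copy_with_markers)])
print(len(copy_without_markers), len(copy_with_markers))
```

In this model the markers only start to matter once copies are taken in several passes, which is the case discussed later in the thread.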

Any info would be helpful.

Thanks,
Patrick


[1] http://hbase.apache.org/book/ops_mgt.html#copytable

Re: CopyTable

Posted by Asaf Mesika <as...@gmail.com>.
Have you guys thought about adding coprocessor hooks to replication,
like preReplicateLogEntries, or something like that? I mean, in his case,
perhaps such a hook could have changed the table name before the edits run
through the replication process at the sink RS.



On Fri, Jun 21, 2013 at 1:48 AM, lars hofhansl <la...@apache.org> wrote:

> I added that, but again only in 0.94 :(

Re: CopyTable

Posted by lars hofhansl <la...@apache.org>.
I added that, but again only in 0.94 :(



----- Original Message -----
From: Patrick Schless <pa...@gmail.com>
To: user <us...@hbase.apache.org>
Cc: 
Sent: Thursday, June 20, 2013 3:39 PM
Subject: Re: CopyTable

In my case, I can't disable the table, but I can pause updates (for long
enough to switch to an identical-but-renamed table).

The CopyTable job runs well, and subsequent runs (after the first) run very
quickly (under a minute). The problem is that it doesn't include deletes
(the "all.cells" option listed in the docs isn't available in 0.92). Is
there any other way to get incremental backups, including deletes?



Re: CopyTable

Posted by Patrick Schless <pa...@gmail.com>.
In my case, I can't disable the table, but I can pause updates (for long
enough to switch to an identical-but-renamed table).

The CopyTable job runs well, and subsequent runs (after the first) run very
quickly (under a minute). The problem is that it doesn't include deletes
(the "all.cells" option listed in the docs isn't available in 0.92). Is
there any other way to get incremental backups, including deletes?
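
A toy illustration of the gap described above (plain Python, not HBase code): if each incremental pass only applies the puts it sees in its time window, a delete issued between passes never reaches the target. The event tuples and the `incremental_copy` helper are hypothetical stand-ins for a time-ranged scan, not anything CopyTable exposes.

```python
def incremental_copy(events, target, start, end, include_deletes=False):
    """Apply source events with start <= ts < end to target (dict: row -> value).

    events: list of (ts, op, row, value) where op is "put" or "delete".
    Without include_deletes, delete events are skipped, modelling a scan
    that does not return delete markers.
    """
    for ts, op, row, value in sorted(events, key=lambda e: e[0]):
        if not (start <= ts < end):
            continue
        if op == "put":
            target[row] = value
        elif include_deletes:
            target.pop(row, None)
    return target

events = [
    (1, "put", "r1", "a"),
    (2, "put", "r2", "b"),
    (5, "delete", "r1", None),   # issued after the first pass ran
    (6, "put", "r3", "c"),
]

# First pass covers [0, 4), second pass covers [4, 10).
plain = incremental_copy(events, {}, 0, 4)
plain = incremental_copy(events, plain, 4, 10)

with_deletes = incremental_copy(events, {}, 0, 4, include_deletes=True)
with_deletes = incremental_copy(events, with_deletes, 4, 10, include_deletes=True)

print(sorted(plain))         # r1 is still present in the target
print(sorted(with_deletes))  # r1 is gone, matching the source
```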


On Wed, Jun 19, 2013 at 12:24 PM, Matteo Bertozzi
<th...@gmail.com> wrote:

> On 0.92 you can use the latest rename script posted on the HBASE-7896 jira.
> note that in this case you've to disable your table first.
>
>
> Matteo

Re: CopyTable

Posted by Matteo Bertozzi <th...@gmail.com>.
On 0.92 you can use the latest rename script posted on the HBASE-7896 JIRA.
Note that in this case you have to disable your table first.


Matteo



On Wed, Jun 19, 2013 at 6:19 PM, Patrick Schless
<pa...@gmail.com> wrote:

> Unfortunately, I'm on 0.92.1, and the snapshot approach you linked isn't
> available until 0.94. Bummer, looked cool.
>
> Anybody have any insight into the question around the CopyTable process? Or
> know another way to rename a table in 0.92.1?
>
> Thanks,
> Patrick

Re: CopyTable

Posted by Patrick Schless <pa...@gmail.com>.
Unfortunately, I'm on 0.92.1, and the snapshot approach you linked isn't
available until 0.94. Bummer, looked cool.

Anybody have any insight into the question around the CopyTable process? Or
know another way to rename a table in 0.92.1?

Thanks,
Patrick


On Mon, Jun 17, 2013 at 3:21 PM, Patrick Schless
<pa...@gmail.com> wrote:

> Sweet, I'll give that a try (I hadn't seen that before), thanks.
>
> If it's not super fast (under a few minutes), I'll still have to go with
> the CopyTable approach, though. I'm currently testing, but my assumption is
> that I can do a series of CopyTables (all but the first CopyTable would
> specify a starttime of when the previous job began) will end up with only a
> small period of downtime (the final CopyTable).

Re: CopyTable

Posted by Patrick Schless <pa...@gmail.com>.
Sweet, I'll give that a try (I hadn't seen that before), thanks.

If it's not super fast (under a few minutes), I'll still have to go with
the CopyTable approach, though. I'm currently testing, but my assumption is
that a series of CopyTables (all but the first specifying a starttime of
when the previous job began) will end up with only a small period of
downtime (the final CopyTable).
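
The plan above can be sketched as a small driver (Python, illustrative only). It builds one CopyTable command line per pass and hands each later pass the previous pass's begin time as `--starttime`. The `--starttime` and `--new.name` options are the ones listed in the CopyTable usage; the table names, the timestamps, and the `copytable_cmd` / `plan_passes` helpers are assumptions, and nothing here launches a job.

```python
def copytable_cmd(source, target, starttime=None):
    """Return the argv for one CopyTable pass (built only, not executed)."""
    cmd = ["hbase", "org.apache.hadoop.hbase.mapreduce.CopyTable",
           "--new.name=" + target]
    if starttime is not None:
        # Restrict the pass to cells stamped at or after starttime (epoch millis).
        cmd.append("--starttime=%d" % starttime)
    cmd.append(source)
    return cmd

def plan_passes(source, target, begin_times_ms):
    """begin_times_ms[i] is when pass i began; pass i+1 starts from that time."""
    cmds = [copytable_cmd(source, target)]   # first pass: full copy
    for prev_begin in begin_times_ms[:-1]:
        cmds.append(copytable_cmd(source, target, prev_begin))
    return cmds

# Hypothetical begin times for three passes (epoch milliseconds).
passes = plan_passes("events", "events_renamed",
                     [1371500000000, 1371503600000, 1371503700000])
for argv in passes:
    print(" ".join(argv))
```

Only the last pass would need to run with updates paused. Note that this relies on cell timestamps tracking wall-clock time, which may not hold if clients set their own timestamps.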


On Mon, Jun 17, 2013 at 3:13 PM, Ted Yu <yu...@gmail.com> wrote:

> bq.  since there doesn't seem to be a good way to rename a table
>
> Have you looked at http://hbase.apache.org/book.html#table.rename ?
>
> Cheers

Re: CopyTable

Posted by Ted Yu <yu...@gmail.com>.
bq.  since there doesn't seem to be a good way to rename a table

Have you looked at http://hbase.apache.org/book.html#table.rename ?

Cheers
