Posted to dev@phoenix.apache.org by "김영우 (YoungWoo Kim)" <yw...@apache.org> on 2017/01/05 13:46:50 UTC

Dropping local index on large table caused crash of region servers

Hi,

I've run into region server (RS) crashes while dropping a local index from a
very large table. I have several tables with billions of rows, hosted on
HBase 1.2.4 and Phoenix 4.9.0.

Each region server reported 'gc pause' in its logs, which is strange to me
because the log messages show up when I drop local indexes. Eventually, my
query failed with a timeout. Other operations, such as inserting/bulk-loading
data or querying data, work fine.

I'm not sure, but it seems that dropping local indexes from a large table is
expensive. Is there any workaround for this?

Thanks,
Youngwoo

Re: Dropping local index on large table caused crash of region servers

Posted by "김영우 (Youngwoo Kim)" <wa...@gmail.com>.

Thank you guys. Good to know that.

Samarth,
The following is the output from the CLI:

> drop index TBL_IDX1 ON MYSCHEMA.TBL;
Error: Operation timed out. (state=TIM01,code=6000)
java.sql.SQLTimeoutException: Operation timed out.
    at org.apache.phoenix.exception.SQLExceptionCode$15.newException(SQLExceptionCode.java:387)
    at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:150)
    at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:788)
    at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:699)
    at org.apache.phoenix.iterate.ConcatResultIterator.getIterators(ConcatResultIterator.java:50)
    at org.apache.phoenix.iterate.ConcatResultIterator.currentIterator(ConcatResultIterator.java:97)
    at org.apache.phoenix.iterate.ConcatResultIterator.next(ConcatResultIterator.java:117)
    at org.apache.phoenix.iterate.BaseGroupedAggregatingResultIterator.next(BaseGroupedAggregatingResultIterator.java:64)
    at org.apache.phoenix.iterate.UngroupedAggregatingResultIterator.next(UngroupedAggregatingResultIterator.java:39)
    at org.apache.phoenix.compile.PostDDLCompiler$2.execute(PostDDLCompiler.java:287)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.updateData(ConnectionQueryServicesImpl.java:3186)
    at org.apache.phoenix.schema.MetaDataClient.dropTable(MetaDataClient.java:2543)
    at org.apache.phoenix.schema.MetaDataClient.dropIndex(MetaDataClient.java:2412)
    at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableDropIndexStatement$1.execute(PhoenixStatement.java:984)
    at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:358)
    at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:341)
    at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
    at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:340)
    at org.apache.phoenix.jdbc.PhoenixStatement.execute(PhoenixStatement.java:1511)
    at sqlline.Commands.execute(Commands.java:822)
    at sqlline.Commands.sql(Commands.java:732)
    at sqlline.SqlLine.dispatch(SqlLine.java:813)
    at sqlline.SqlLine.begin(SqlLine.java:686)
    at sqlline.SqlLine.start(SqlLine.java:398)
    at sqlline.SqlLine.main(SqlLine.java:291)

On Fri, Jan 6, 2017 at 4:11 AM, Samarth Jain <sa...@gmail.com> wrote:

> That is correct. Renew lease feature is available starting Phoenix 4.7.0+
> and HBase 1.1.3+. Youngwoo, it would be great if you can provide us
> stacktraces of the timeout exceptions you are seeing.
>
> On Thu, Jan 5, 2017 at 10:37 AM, James Taylor <ja...@apache.org> wrote:
>
> > It should not be necessary to increase the RPC timeout when dropping an
> > index with the renew lease changes we made (Phoenix 4.7.0+ and HBase
> > 1.1.3+). If it still is, please file a JIRA.
> >
> > Samarth - can you confirm the version requirements?
> >
> > Also - take a look at PHOENIX-3262. It'd be an interesting approach to
> > dropping a local index.
> >
> > Thanks,
> > James
> >
> > On Thu, Jan 5, 2017 at 10:25 AM, rajeshbabu@apache.org <chrajeshbabu32@gmail.com> wrote:
> >
> > > Dropping local index index issues scan and in the coprocessors we prepare
> > > index data and write back to same table.  To avoid the timeout you can
> > > increase scanner timeout/rpc timeout and phoenix client timeout values as
> > > well.
> > >
> > > If you have many guideposts then we might issue many parallel scans to the
> > > same RS which might have added lot of overhead.
> > > Try dropping the index with less number of guideposts on the table.
> > >
> > > Actually during major compaction we can automatically skip writing back the
> > > deleted index details so we can just drop meta data. I am checking this
> > > improvement. Raised https://issues.apache.org/jira/browse/PHOENIX-3566 for
> > > the same.
> > >
> > > Thanks,
> > > Rajeshbabu.
> > >
> > > On Thu, Jan 5, 2017 at 7:16 PM, 김영우 (YoungWoo Kim) <yw...@apache.org> wrote:
> > >
> > > > Hi,
> > > >
> > > > I've faced crash of RS when I'm dropping local index from very large table.
> > > > I have several tables with billions of row and the tables are hosted on
> > > > HBase 1.2.4 and Phoenix 4.9.0
> > > >
> > > > Each region server reported 'gc pause' from their logs and it's strange to
> > > > me because, the log messages are popped up when I drop local indexes.
> > > > eventually, my query failed with timeout. any other operations like
> > > > insert/bulkload data or querying data works fine.
> > > >
> > > > I'm not sure but dropping local indexes from large table is expensive. Is
> > > > there any workaround for this?
> > > >
> > > > Thanks,
> > > > Youngwoo

Re: Dropping local index on large table caused crash of region servers

Posted by Samarth Jain <sa...@gmail.com>.

That is correct. The renew lease feature is available starting with Phoenix
4.7.0 and HBase 1.1.3. Youngwoo, it would be great if you could provide us
with stack traces of the timeout exceptions you are seeing.
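
As a rough illustration of the version requirement above, the following Java
sketch checks whether a given Phoenix/HBase pair meets the stated minimums.
The class and method names are hypothetical and are not part of Phoenix; only
the minimum versions (4.7.0 and 1.1.3) and the reporter's versions (4.9.0 and
1.2.4) come from this thread. It assumes plain dotted numeric version strings.

```java
public class RenewLeaseVersionCheck {

    // Compares two dotted numeric versions, e.g. "4.9.0" vs "4.7.0".
    // Returns a negative, zero, or positive value like Comparable.compareTo.
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            // Missing components are treated as 0, so "4.7" equals "4.7.0".
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // Minimum versions as stated in this thread (hypothetical helper).
    static boolean renewLeaseAvailable(String phoenixVersion, String hbaseVersion) {
        return compareVersions(phoenixVersion, "4.7.0") >= 0
            && compareVersions(hbaseVersion, "1.1.3") >= 0;
    }

    public static void main(String[] args) {
        // The versions reported in the original post.
        System.out.println(renewLeaseAvailable("4.9.0", "1.2.4"));
    }
}
```

For the versions in the original post the check passes, which is consistent
with the request to file a JIRA if the timeout still occurs.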


Re: Dropping local index on large table caused crash of region servers

Posted by James Taylor <ja...@apache.org>.

It should not be necessary to increase the RPC timeout when dropping an
index with the renew lease changes we made (Phoenix 4.7.0+ and HBase
1.1.3+). If it still is, please file a JIRA.

Samarth - can you confirm the version requirements?

Also - take a look at PHOENIX-3262. It'd be an interesting approach to
dropping a local index.

Thanks,
James


Re: Dropping local index on large table caused crash of region servers

Posted by "rajeshbabu@apache.org" <ch...@gmail.com>.

Dropping a local index issues a scan, and in the coprocessors we prepare the
index data and write it back to the same table. To avoid the timeout, you can
increase the scanner timeout/RPC timeout and the Phoenix client timeout values
as well.
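
A minimal sketch of what raising those timeouts on the client side could look
like, assuming the usual property names phoenix.query.timeoutMs,
hbase.rpc.timeout and hbase.client.scanner.timeout.period. The class name, the
helper method and the one-hour value are illustrative assumptions, not
recommendations. As far as I know, the scanner timeout also has to be raised in
hbase-site.xml on the region servers to take full effect.

```java
import java.util.Properties;

public class DropIndexTimeouts {

    // Builds client-side overrides for the three timeouts (hypothetical helper).
    static Properties timeoutOverrides(long timeoutMs) {
        Properties props = new Properties();
        String value = Long.toString(timeoutMs);
        props.setProperty("phoenix.query.timeoutMs", value);             // Phoenix client timeout
        props.setProperty("hbase.rpc.timeout", value);                   // HBase RPC timeout
        props.setProperty("hbase.client.scanner.timeout.period", value); // scanner timeout
        return props;
    }

    public static void main(String[] args) {
        // One hour, chosen only for illustration.
        Properties props = timeoutOverrides(3_600_000L);
        props.forEach((k, v) -> System.out.println(k + "=" + v));
        // With a reachable cluster, these would be passed to the JDBC driver:
        //   DriverManager.getConnection("jdbc:phoenix:<zookeeper-quorum>", props)
        // and the DROP INDEX statement executed on that connection.
    }
}
```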

If you have many guideposts, we might issue many parallel scans to the same
RS, which might have added a lot of overhead. Try dropping the index with
fewer guideposts on the table.
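
One way to end up with fewer guideposts is to widen the guidepost width and
recollect statistics before dropping the index. The sketch below only builds
the SQL text. It assumes the table-level GUIDE_POSTS_WIDTH property is
supported by the Phoenix version in use; the class name, the helper method and
the 1 GB width are hypothetical. The table name MYSCHEMA.TBL is taken from the
CLI output posted in this thread.

```java
import java.util.Arrays;
import java.util.List;

public class FewerGuideposts {

    // Returns the statements to run, in order (hypothetical helper).
    // A larger width in bytes means fewer guideposts per region.
    static List<String> statements(String table, long guidePostWidthBytes) {
        return Arrays.asList(
            "ALTER TABLE " + table + " SET GUIDE_POSTS_WIDTH = " + guidePostWidthBytes,
            "UPDATE STATISTICS " + table);
    }

    public static void main(String[] args) {
        // 1 GB width, chosen only for illustration.
        for (String sql : statements("MYSCHEMA.TBL", 1024L * 1024L * 1024L)) {
            System.out.println(sql + ";");
        }
        // With a reachable cluster, each statement would be executed through
        // java.sql.Statement.execute(sql) before issuing the DROP INDEX.
    }
}
```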

Actually, during major compaction we can automatically skip writing back the
data for the deleted index, so we could just drop the metadata. I am looking
into this improvement and have raised
https://issues.apache.org/jira/browse/PHOENIX-3566 for it.

Thanks,
Rajeshbabu.
