Posted to user@phoenix.apache.org by Aleksandr Saraseka <as...@eztexting.com> on 2019/08/09 14:21:26 UTC

Phoenix Index Scrutiny Tool

Hello community!
I'm testing the scrutiny tool to check index consistency.
I hard-deleted a couple of rows of a global index from HBase, then ran the
scrutiny tool, and it showed me output like this:


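+--------------+--------------+-----------------------+----------------------------------+---------------+-----------+----------------+
| SOURCE_TABLE | TARGET_TABLE | SCRUTINY_EXECUTE_TIME | SOURCE_ROW_PK_HASH               | SOURCE_TS     | TARGET_TS | HAS_TARGET_ROW |
+--------------+--------------+-----------------------+----------------------------------+---------------+-----------+----------------+
| INDEX_TABLE  | DATA_TABLE   | 1565358267566         | 8a74d1f8286a7ec7ce99b22ee0723ab1 | 1565358171998 | -1        | false          |
| INDEX_TABLE  | DATA_TABLE   | 1565358267566         | a2cfe11952f3701d340069f80e2a82b7 | 1565358135292 | -1        | false          |
+--------------+--------------+-----------------------+----------------------------------+---------------+-----------+----------------+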

So, let's imagine that I want to repair my index and don't want to run a full
rebuild (the table is huge).

What's the best option?

Two things came to my mind:

- Find the row in the data table and upsert the necessary data into the index
table.

- Find the row in the data table, export it, delete it, and then insert it
again.

And the main question: how can I get a row from the data or index table by
its primary key hash?
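
A hash cannot be reversed into a key, so one way to approach the question above is to recompute the hash for candidate rows and compare. The sketch below is a minimal illustration in Python, not the tool's own code. It assumes that SOURCE_ROW_PK_HASH is the hex MD5 digest of the row's primary key column bytes, fed to the digest in key order (the 32-character values in the output are consistent with MD5, but this should be verified against your Phoenix version). The names pk_hash and find_rows_by_pk_hash are hypothetical, and the candidate keys are assumed to be already serialized to bytes the same way Phoenix serializes them.

```python
import hashlib
from typing import Iterable, Iterator, Sequence, Tuple

PkBytes = Tuple[bytes, ...]  # one bytes value per primary key column


def pk_hash(pk_columns: Sequence[bytes]) -> str:
    """Hex MD5 over the PK column bytes, in key order (assumed scheme)."""
    md5 = hashlib.md5()
    for column in pk_columns:
        md5.update(column)
    return md5.hexdigest()


def find_rows_by_pk_hash(
    candidates: Iterable[PkBytes], wanted_hashes: Iterable[str]
) -> Iterator[PkBytes]:
    """Yield every candidate key whose hash is one of the reported hashes."""
    wanted = {h.lower() for h in wanted_hashes}
    for pk in candidates:
        if pk_hash(pk) in wanted:
            yield pk


if __name__ == "__main__":
    # Hypothetical keys; in practice these would come from scanning the
    # source table named in the scrutiny output.
    keys = [(b"tenant1", b"row-a"), (b"tenant1", b"row-b")]
    reported = [pk_hash(keys[1])]
    for match in find_rows_by_pk_hash(keys, reported):
        print(match)
```

On a huge table this still means scanning the keys of the source table, so it only helps when the scan can be narrowed, for example by the SOURCE_TS values reported next to each hash.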


-- 
Aleksandr Saraseka
DBA
380997600401
 *•*  asaraseka@eztexting.com  *•*  eztexting.com

Re: Phoenix Index Scrutiny Tool

Posted by Aleksandr Saraseka <as...@eztexting.com>.
Update: yes, my first example of output seems to be wrong.
I re-ran my test and the output now looks right:
+---------------+---------------------------------------+-----------------------+------------------------+
| SOURCE_TABLE  | TARGET_TABLE                          | SCRUTINY_EXECUTE_TIME | SOURCE_ROW_PK_H        |
+---------------+---------------------------------------+-----------------------+------------------------+
| "ALEX"."TEST" | "ALEX"."GLOBAL_DATA1_DATA2_INC_DATA3" | 1565689347373         | 3e3aa2997d0bbf9b4d7247 |


Re: Phoenix Index Scrutiny Tool

Posted by Aleksandr Saraseka <as...@eztexting.com>.
@Vincent - I dropped columns from an index table.
Thank you for pointing me to "partial rebuild", will try it.


Re: Phoenix Index Scrutiny Tool

Posted by Vincent Poon <vi...@apache.org>.
@Aleksandr did you delete rows from the data table, or the index table?
The output you're showing says that you have orphaned rows in the index
table, i.e. rows that exist only in the index table and have no
corresponding row in the data table. If you deleted the original rows in
the data table without deleting the corresponding rows in the index table,
and if major compaction has happened (perhaps what you meant by
"hard-delete"?), then in general there's no way to rebuild the index
correctly, as there's no source data to work from. You might have
special cases where you have an index that covers all the data table rows
such that you could in theory go backwards, but I don't believe we have any
tool to do that yet.
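
The reading of the output described above can be sketched in a few lines of Python. This is only an illustration of the reasoning, under the assumption that HAS_TARGET_ROW = false means the source row had no counterpart in the target table. The ScrutinyRow class, the classify function and the label strings are hypothetical names, not part of Phoenix.

```python
from dataclasses import dataclass


@dataclass
class ScrutinyRow:
    """One line of scrutiny output, limited to the columns shown in this thread."""
    source_table: str
    target_table: str
    source_row_pk_hash: str
    source_ts: int
    target_ts: int
    has_target_row: bool


def classify(row: ScrutinyRow, data_table: str) -> str:
    """Label a reported row by which side is missing (assumed semantics)."""
    if row.has_target_row:
        # Both rows exist, so the problem is something other than a missing row.
        return "TARGET_ROW_PRESENT"
    if row.source_table == data_table:
        # Source is the data table: the index has no entry for this data row.
        return "MISSING_INDEX_ROW"
    # Source is the index table: the index row has no data row behind it.
    return "ORPHANED_INDEX_ROW"


if __name__ == "__main__":
    reported = ScrutinyRow(
        source_table="INDEX_TABLE",
        target_table="DATA_TABLE",
        source_row_pk_hash="8a74d1f8286a7ec7ce99b22ee0723ab1",
        source_ts=1565358171998,
        target_ts=-1,
        has_target_row=False,
    )
    print(classify(reported, data_table="DATA_TABLE"))
```

With the first output posted in this thread, where the source is the index table, both rows fall into the orphaned case.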

The IndexTool does have a "partial rebuild" option that works in
conjunction with "ALTER INDEX REBUILD ASYNC" (see PHOENIX-2890). However,
this is not well documented, and I haven't tried it myself.


Re: Phoenix Index Scrutiny Tool

Posted by Alexander Batyrshin <0x...@gmail.com>.
I have a similar question: how can I partially rebuild indexes by timestamp interval, the way many MapReduce jobs accept --starttime/--endtime?
