Posted to issues@trafodion.apache.org by "Alice Chen (JIRA)" <ji...@apache.org> on 2015/07/22 20:19:28 UTC

[jira] [Created] (TRAFODION-1026) LP Bug: 1426479 - Row mismatch between index/table cause init auth to fail

Alice Chen created TRAFODION-1026:
-------------------------------------

             Summary: LP Bug: 1426479 - Row mismatch between index/table cause init auth to fail
                 Key: TRAFODION-1026
                 URL: https://issues.apache.org/jira/browse/TRAFODION-1026
             Project: Apache Trafodion
          Issue Type: Bug
          Components: sql-exe
            Reporter: Roberta Marton
            Assignee: justin.du@hp.com
            Priority: Critical
             Fix For: 1.1 (pre-incubation)


Initialize authorization is failing because of a problem with index maintenance on the OBJECTS metadata table.

After running full regressions, the OBJECTS table has fewer rows than its index. The two row counts should match.

How to recreate:
   Run full regressions on the workstation (either debug or release); everything works.
   Re-run regressions; all catman1 tests fail because initialize authorization fails:

initialize authorization;

*** ERROR[1001] An internal error occurred in module ../sqlcomp/PrivMgrPrivileges.cpp on line 3183.  DETAILS(Expected to insert 246 rows into OBJECT_PRIVILEGES table, instead 238 were found.).

The initialize authorization code first performs an insert .. select -> this returns 238 rows (it reads from the OBJECTS table)
Then the code performs a count(*) on the newly inserted table -> this returns 246 rows (it reads from the OBJECTS table index)

Note that initialize authorization creates five tables and then performs the insert .. select.
When initialize authorization fails, it removes the five tables.

Logging on to sqlci, I ran two similar commands:
   A select with count(*), and
   A select that mimics what the insert .. select performs

select count(*)
       from "_MD_".objects o 
       where o.object_type in ('VI','BT','LB','UR','SG')
;

(EXPR)              
--------------------

                 241

--- 1 row(s) selected.

select 
       object_uid,
       object_owner   
    from "_MD_".objects o    
    where o.object_type in ('VI','BT','LB','UR','SG')
;

OBJECT_UID            OBJECT_OWNER
--------------------  ------------

   61418117531923345         33333
   82247271613094528         33333
. . .
  121090312342560989         33333
  121090312342560327         33333
  121090312342560715         33333

--- 233 row(s) selected.

Note that the row counts in these queries differ from the row counts reported by initialize authorization because the 5 privilege manager tables no longer exist (246 – 5 = 241 and 238 – 5 = 233).
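
The arithmetic above can be checked with a short sketch. The numbers are copied from this report; the variable names are made up for illustration only and do not correspond to anything in the Trafodion code.

```python
# Row counts copied from the report; names are hypothetical.
PRIV_MGR_TABLES = 5         # tables created, then removed, by initialize authorization

expected_during_init = 246  # "Expected to insert 246 rows"
found_during_init = 238     # "instead 238 were found"
count_star_sqlci = 241      # select count(*) run from sqlci afterwards
rows_listed_sqlci = 233     # select object_uid, object_owner run from sqlci afterwards

# Removing the five privilege manager tables accounts for the shift in both counts.
assert expected_during_init - PRIV_MGR_TABLES == count_star_sqlci
assert found_during_init - PRIV_MGR_TABLES == rows_listed_sqlci

# The gap between the two access paths is the same before and after.
gap_during_init = expected_during_init - found_during_init
gap_sqlci = count_star_sqlci - rows_listed_sqlci
print(gap_during_init, gap_sqlci)  # prints "8 8"
```

Both gaps come out to 8 rows, so the sqlci queries reproduce the same discrepancy that initialize authorization hit.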

If I don’t include object_owner in the select list, then 241 rows are returned as expected.

Comparing the UIDs returned by both requests, the following rows were not returned by the second select:

OBJECT_TYPE  SCHEMA_NAME                      OBJECT_NAME
-----------  -------------------------------  --------------------------------------------------

BT           SCH                              SKC
BT           SCH                              DM2C
BT           SCH                              T062A
BT           SCH                              T062B
BT           SCH                              SKC
BT           SCH                              DM2C
BT           SCH                              T062A
BT           SCH                              T062B

--- 8 row(s) selected.

The returned rows have UIDs in the same range:

   83939089253157639
   83939089253158296
   83939089253164783
   83939089253164911
   86190385130448198
   86190385130448858
   86190385130453959
   86190385130454168
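
The UID comparison described above can be sketched as a set difference. This is only an illustration, assuming the UIDs from each query have been copied into lists; the function and variable names are hypothetical, and the first list is a truncated sample taken from the output shown earlier, not the full 233 rows.

```python
def missing_uids(complete, partial):
    """Return the UIDs that appear in `complete` but not in `partial`, sorted."""
    return sorted(set(complete) - set(partial))

# Truncated sample of UIDs returned by the select that includes object_owner.
with_owner = [
    61418117531923345,
    82247271613094528,
    121090312342560989,
    121090312342560327,
    121090312342560715,
]

# The eight UIDs listed above as not returned by that select.
not_returned = [
    83939089253157639, 83939089253158296, 83939089253164783, 83939089253164911,
    86190385130448198, 86190385130448858, 86190385130453959, 86190385130454168,
]

# Stand-in for the UID-only select, which returns every row.
uid_only = with_owner + not_returned

missing = missing_uids(uid_only, with_owner)
print(len(missing))  # prints 8
```

With the complete lists from both selects, the same function would report exactly the rows that are being skipped.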

Next I ran a select count(*) on the index and on the table (reading all the rows, not just a subset):

Set parserflags 1;
select count(*) from table (index_table objects_uniq_idx);
..

(EXPR)              
--------------------

                 448

--- 1 row(s) selected.
>>select count(*) from objects;

(EXPR)              
--------------------

                 436

There is a difference – there are more rows in the index than in the table.  This seems to be the problem encountered by initialize authorization.

Hopefully, the table names above will point to the statements that are failing to do index maintenance, so that the problem can be recreated more easily.

------> Mike

All the tables above come from compGeneral/TEST062. I ran it by itself and then checked for corruption, but I didn’t find any, so this might be a little harder to debug.


ps - What’s the significance of the UIDs being in the same range? 

------> Selva

This might be yet another issue that we need to understand better. A couple of days ago, I ran executor/TEST106 in my workspace because Jenkins had some failures with this test. For some reason, this left the metadata for the table TEST106 inconsistent, even though the test passed. I couldn’t drop T106B (I think) because the drop was unable to delete all the rows from SB_HISTOGRAM_INTERVALS and kept getting stuck there. However, I didn’t find this table in HBase.

I initialized Trafodion again and ran full regressions again. This time a similar problem happened with the T42 table in compGeneral/TEST011.


------> Sandhya

This is troubling, though. I don’t recall this issue being reported before, where our dev regressions left the metadata in an inconsistent state so easily. We recently made some changes that altered the plan for the delete statement that gets executed during a drop table; this was to avoid an error 73 issue when concurrent DDLs were performed. I wonder if there is some issue with that CQD and the plan that causes index maintenance on the metadata tables to not work correctly. Or perhaps some other change is causing this regression. I am just guessing here, with no concrete clues. As Mike says, this may be difficult to pinpoint.


