Posted to issues@trafodion.apache.org by "Suresh Subbiah (JIRA)" <ji...@apache.org> on 2015/07/27 13:05:04 UTC

[jira] [Created] (TRAFODION-1424) Enable CIF (compressed internal format) for Trafodion scan operator

Suresh Subbiah created TRAFODION-1424:
-----------------------------------------

             Summary: Enable CIF (compressed internal format) for Trafodion scan operator
                 Key: TRAFODION-1424
                 URL: https://issues.apache.org/jira/browse/TRAFODION-1424
             Project: Apache Trafodion
          Issue Type: Improvement
          Components: sql-cmp, sql-exe
    Affects Versions: 0.6 (pre-incubation)
            Reporter: Suresh Subbiah
            Assignee: Suresh Subbiah


When the scan operator reads varchar data from a Trafodion table, it converts the data to exploded format by blank-padding each value to its maximum declared length. A change contributed by Khaled Bouaziz extends Trafodion's compressed internal format (CIF) feature to cover the Trafodion scan operator.
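
To make the cost of blank padding concrete, here is a minimal sketch that compares the bytes used by a VARCHAR(100) value when it is padded to its declared length with the bytes used when only the actual data is kept. This is a simplified model and not Trafodion code: the function names are hypothetical, the sample values are only loosely modeled on the testcif rows shown later, and length indicators, null indicators, and row headers are ignored.

```python
# Simplified model (an assumption, not Trafodion's actual row layout):
# compare storage for a VARCHAR value when it is blank-padded to the
# declared maximum versus when only the actual bytes are kept.

DECLARED_LEN = 100  # VARCHAR(100), as in the testcif example


def padded_size(value: str, declared_len: int = DECLARED_LEN) -> int:
    """Bytes used when the value is blank-padded to the declared length."""
    data = value.encode("latin-1")  # ISO88591: one byte per character
    if len(data) > declared_len:
        raise ValueError("value exceeds declared length")
    return len(data.ljust(declared_len, b" "))


def actual_size(value: str) -> int:
    """Bytes used when only the actual data is stored."""
    return len(value.encode("latin-1"))


# Illustrative values only; the lengths are chosen for the example.
values = ["aaa", "bbbbb", "c" * 13, "e" * 30]

padded_total = sum(padded_size(v) for v in values)
actual_total = sum(actual_size(v) for v in values)

if __name__ == "__main__":
    print(f"padded: {padded_total} bytes, actual: {actual_total} bytes")
```

In this model the four short values occupy 400 bytes when padded but only 51 bytes of actual data, which is the kind of saving the change is after.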

Here is an email exchange with Khaled on this subject:

Hi Anoop:

I made a few changes to use CIF with the traf scan.

The changes are mostly for the convert expression:
-	Changed the row format of the convert row to aligned format
-	Added code to use CIF so that we only use the needed space

If we can store the CIF row length with the row when we insert/update, then I think we can bypass the convert expression.
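
As a rough illustration of "only use the needed space", the sketch below estimates a row length from the offsets printed for the VARCHAR(100) result operand in the Convert Expr dump below (Offset = 32, with the 4-byte length indicator at VClenIndOffset = 28). It assumes the varchar is the last field in the row and ignores any trailing alignment padding the real aligned format may add; the function name and the `compact` flag are hypothetical.

```python
# Rough estimate only. The constants are read from the expression dump;
# the formula itself is an assumption, not Trafodion's implementation.

VARCHAR_DATA_OFFSET = 32  # "Offset = 32" for the VARCHAR(100) operand
DECLARED_LEN = 100        # "Length = 100"


def estimated_row_length(varchar_len: int, compact: bool = True) -> int:
    """Estimate the bytes occupied by one output row.

    With compact=True only the actual varchar bytes follow the fixed
    part of the row; with compact=False the declared maximum is always
    reserved.
    """
    if not 0 <= varchar_len <= DECLARED_LEN:
        raise ValueError("varchar length out of range")
    tail = varchar_len if compact else DECLARED_LEN
    return VARCHAR_DATA_OFFSET + tail


if __name__ == "__main__":
    for n in (3, 5, 30):
        print(n, estimated_row_length(n), estimated_row_length(n, compact=False))
```

Under these assumptions a row holding a 3-byte value is estimated at 35 bytes, against 132 bytes when the full declared length is reserved.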





>>select * from testcif;   

A            B
-----------  ----------------------------------------------------------------------------------------------------

          1  aaa                                                                                                 
          1  bbbbb                                                                                               
          1  ccccccccccccc                                                                                       
          1  eeeeeeeeeeeeeeeeeeeeeeeeeeeeee                                                                      

--- 4 row(s) selected.
>>fc                  

Expression: Convert Expr
Expr Len: 632, Consts Len: 8
flags_ = 0000000010001000   
  Clause #1: ex_function_clause
    OperatorTypeEnum = ITM_HEADER(2375), NumOperands = 1
    ex_clause::flags_ = 0000000010000000                
    ex_function_clause::flags_ = 0000000000000000       
    PCODE  = supported                                  
    Operand #0 (result):                                
      Datatype = REC_BYTE_F_ASCII(0), Length = 16, Null Flag = 0
      Precision = 0, Scale = 1, Collation = 1, flags_ = 0000001000001000
      Tuple Data Format = SQLMX_ALIGNED_FORMAT                          
      Atp = 1, AtpIndex = 2                                             
      Offset = 0, NullIndOffset = -1, VClenIndOffset = -1               
      RelOffset = 0, VoaOffset = -1, NullBitIdx = -1                    
      NullIndLength = 0, VClenIndLength = 0                             
      ValueId = 0                                                       
      Text = Hdr                                                        

  Clause #2: ex_conv_clause
    OperatorTypeEnum = ITM_CAST(2452), NumOperands = 2
    ex_clause::flags_ = 0000000010000000              
    ex_conv_clause::flags_ = 0000000000000000         
    PCODE  = supported                                
    Operand #0 (result):                              
      Datatype = REC_BIN64_SIGNED(134), Length = 8, Null Flag = 0
      Precision = 0, Scale = 0, Collation = 1, flags_ = 0000001000001001
      Tuple Data Format = SQLMX_ALIGNED_FORMAT                          
      Atp = 1, AtpIndex = 2                                             
      Offset = 16, NullIndOffset = -1, VClenIndOffset = -1              
      RelOffset = 0, VoaOffset = -1, NullBitIdx = -1                    
      NullIndLength = 0, VClenIndLength = 0                             
      ValueId = 30                                                      
      Text = cast                                                       

    Operand #1:
      Datatype = REC_BIN64_SIGNED(134), Length = 8, Null Flag = 0
      Precision = 0, Scale = 0, Collation = 1, flags_ = 0000001000001001
      Tuple Data Format = SQLMX_ALIGNED_FORMAT                          
      Atp = 1, AtpIndex = 4                                             
      Offset = 16, NullIndOffset = -1, VClenIndOffset = -1              
      RelOffset = 0, VoaOffset = -1, NullBitIdx = -1                    
      NullIndLength = 0, VClenIndLength = 0                             
      ValueId = 29                                                      
      Text = LARGEINT                                                   

  Clause #3: ex_conv_clause
    OperatorTypeEnum = ITM_CAST(2452), NumOperands = 2
    ex_clause::flags_ = 0000000010000110              
    ex_conv_clause::flags_ = 0000000000000000         
    PCODE  = supported                                
    Operand #0 (result):                              
      Datatype = REC_BIN32_SIGNED(132), Length = 4, Null Flag = 1
      Precision = 0, Scale = 0, Collation = 1, flags_ = 0000001000001001
      Tuple Data Format = SQLMX_ALIGNED_FORMAT                          
      Atp = 1, AtpIndex = 2                                             
      Offset = 24, NullIndOffset = 12, VClenIndOffset = -1              
      RelOffset = 8, VoaOffset = -1, NullBitIdx = 0                     
      NullIndLength = 0, VClenIndLength = 0                             
      ValueId = 32                                                      
      Text = cast                                                       

    Operand #1:
      Datatype = REC_BIN32_SIGNED(132), Length = 4, Null Flag = 1
      Precision = 0, Scale = 0, Collation = 1, flags_ = 0000001000001001
      Tuple Data Format = SQLMX_ALIGNED_FORMAT                          
      Atp = 1, AtpIndex = 4                                             
      Offset = 24, NullIndOffset = 12, VClenIndOffset = -1              
      RelOffset = 8, VoaOffset = -1, NullBitIdx = 0                     
      NullIndLength = 0, VClenIndLength = 0                             
      ValueId = 31                                                      
      Text = INTEGER SIGNED                                             

  Clause #4: ex_conv_clause
    OperatorTypeEnum = ITM_CAST(2452), NumOperands = 2
    ex_clause::flags_ = 0000000010000110              
    ex_conv_clause::flags_ = 0000000000000000         
    PCODE  = supported                                
    Operand #0 (result):                              
      Datatype = REC_BYTE_V_ASCII(64), Length = 100, Null Flag = 1
      Precision = 0, Scale = 1, Collation = 1, flags_ = 0000001000001000
      Tuple Data Format = SQLMX_ALIGNED_FORMAT                          
      Atp = 1, AtpIndex = 2                                             
      Offset = 32, NullIndOffset = 12, VClenIndOffset = 28              
      RelOffset = 0, VoaOffset = 8, NullBitIdx = 1                      
      NullIndLength = 0, VClenIndLength = 4                             
      ValueId = 34                                                      
      Text = cast                                                       

    Operand #1:
      Datatype = REC_BYTE_V_ASCII(64), Length = 100, Null Flag = 1
      Precision = 0, Scale = 1, Collation = 1, flags_ = 0000001000001000
      Tuple Data Format = SQLMX_ALIGNED_FORMAT                          
      Atp = 1, AtpIndex = 4                                             
      Offset = 32, NullIndOffset = 12, VClenIndOffset = 28              
      RelOffset = 0, VoaOffset = 8, NullBitIdx = 1                      
      NullIndLength = 0, VClenIndLength = 4                             
      ValueId = 33                                                      
      Text = VARCHAR(100) CHARACTER SET ISO88591                        

  PCode:

PCode Expr Length: 332
    [1]               
    HDR_MPTR32_IBIN32S_IBIN32S_IBIN32S_IBIN32S_IBIN32S (303) 4 0 16 16 4 12 4
    MOVE_MBIN64S_MBIN64S (203) 4 16 5 16                                     
    NOT_NULL_BRANCH_MBIN32S_MBIN32S_IATTR3_IBIN32S (248) 4 12 5 12 134219270 0 0 14  (Tgt: 3)

    [2]
    MOVE_MBIN32S_IBIN32S (3) 4 24 0
    BRANCH (95) 6  (Tgt: 4)        

    [3]  (Preds: 1 )
    MOVE_MBIN32U_MBIN32U (202) 4 24 5 24

    [4]  (Preds: 2 )
    NOT_NULL_BRANCH_MBIN32S_MBIN32S_IATTR3_IBIN32S (248) 4 12 5 12 134219270 1 1 24  (Tgt: 6)

    [5]
    FILL_MEM_BYTES_VARIABLE (317) 4 32 8 100 1024 0 0
    UPDATE_ROWLEN3_MATTR5_IBIN32S (316) 4 32 8 -1 1024 4
    RETURN (264)                                        

    [6]  (Preds: 4 )
    MOVE_MATTR5_MATTR5 (284) 4 32 8 100 1024 5 32 8 100 1024
    UPDATE_ROWLEN3_MATTR5_IBIN32S (316) 4 32 8 -1 1024 4    
    RETURN (264)                                            

Expression: ScanExpr is NULL

Expression: RowIdExpr
Expr Len: 384, Consts Len: 8
flags_ = 0000000010001000   
  Clause #1: ex_conv_clause 
    OperatorTypeEnum = ITM_CAST(2452), NumOperands = 2
    ex_clause::flags_ = 0000000010000000              
    ex_conv_clause::flags_ = 0000000000010000         
    PCODE  = supported                                
    Operand #0 (result):                              
      Datatype = REC_BIN64_SIGNED(134), Length = 8, Null Flag = 0
      Precision = 0, Scale = 0, Collation = 1, flags_ = 0000001000001000
      Tuple Data Format = SQLARK_EXPLODED_FORMAT                        
      Atp = 0, AtpIndex = 1 (Temporary)                                 
      Offset = 0, NullIndOffset = -1, VClenIndOffset = -1               
      NullIndLength = 0, VClenIndLength = 0                             
      ValueId = 36                                                      
      Text = cast                                                       

    Operand #1:
      Datatype = REC_BYTE_V_ASCII(64), Length = 8, Null Flag = 0
      Precision = 0, Scale = 1, Collation = 1, flags_ = 0000001000001000
      Tuple Data Format = SQLARK_EXPLODED_FORMAT                        
      Atp = 1, AtpIndex = 5                                             
      Offset = 2, NullIndOffset = -1, VClenIndOffset = 0                
      NullIndLength = 0, VClenIndLength = 2                             
      ValueId = 35                                                      
      Text = VARCHAR(8) CHARACTER SET ISO88591                          

  Clause #2: ex_function_clause
    OperatorTypeEnum = ITM_COMP_ENCODE(2114), NumOperands = 2
    ex_clause::flags_ = 0000000010000000                     
    ex_function_clause::flags_ = 0000000000000000            
    PCODE  = supported                                       
    Operand #0 (result):                                     
      Datatype = REC_BYTE_F_ASCII(0), Length = 8, Null Flag = 0
      Precision = 0, Scale = 1, Collation = 1, flags_ = 0000001000001000
      Tuple Data Format = SQLARK_EXPLODED_FORMAT                        
      Atp = 1, AtpIndex = 3                                             
      Offset = 0, NullIndOffset = -1, VClenIndOffset = -1               
      NullIndLength = 0, VClenIndLength = 0                             
      ValueId = 37                                                      
      Text = comp_encode                                                

    Operand #1:
      Datatype = REC_BIN64_SIGNED(134), Length = 8, Null Flag = 0
      Precision = 0, Scale = 0, Collation = 1, flags_ = 0000001000001000
      Tuple Data Format = SQLARK_EXPLODED_FORMAT                        
      Atp = 0, AtpIndex = 1 (Temporary)                                 
      Offset = 0, NullIndOffset = -1, VClenIndOffset = -1               
      NullIndLength = 0, VClenIndLength = 0                             
      ValueId = 36                                                      
      Text = cast                                                       

  PCode:

PCode Expr Length: 88
    [1]              
    CONVVCPTR_MBIN64S_MATTR5_IBIN32S (330) 2 0 5 2 -1 8 512 8
    ENCODE_MASCII_MBIN64S_IBIN32S (91) 4 0 2 0 0             
    RETURN (264)                                             

Expression: UpdateExpr is NULL

Expression: MergeInsertExpr is NULL

Expression: LowKeyExpr is NULL

Expression: HighKeyExpr is NULL

Expression: ReturnFetchExpr is NULL

Expression: ReturnUpdateExpr is NULL

Expression: ReturnMergeInsertExpr is NULL

Expression: mergeUpdScanExpr is NULL

Expression: mergeInsertRowIdExpr is NULL

Expression: encodedKeyExpr is NULL

Expression: keyColValExpr is NULL

Expression: hbaseFilterExpr is NULL



From: Sharma, Anoop 
Sent: Tuesday, April 21, 2015 10:04 AM
To: Bouaziz, Khaled; Subbiah, Suresh
Subject: CIF question

hi
  do we use CIF when selecting out of traf scan operator?

for ex:
  create table t (a int not null primary key, b varchar(1000), c varchar(1000));

would “select * from t” use CIF for the output row from scan operator?

Right now I see exploded format in HbaseAccess even if I set cqd compressed_internal_format to ON.

But if I do a join, then the hash join operator uses CIF but scan operator uses exploded.

When is the conversion from exploded to aligned(cif) format done?

anoop



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)