Posted to issues@kudu.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2023/06/17 16:57:00 UTC

[jira] [Commented] (KUDU-3483) Flush multiple data in AUTO_FLUSH_BACKGROUND mode maybe fail when the table schema has changed

    [ https://issues.apache.org/jira/browse/KUDU-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17733790#comment-17733790 ] 

ASF subversion and git services commented on KUDU-3483:
-------------------------------------------------------

Commit fdd373ab1be38f4eeb65360661ec039c5ab4a572 in kudu's branch refs/heads/master from xinghuayu007
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=fdd373ab1 ]

[KUDU-3483] Log requestor info and request id for write interface

Currently, it is hard to trace the whole process of inserting data.
The write interface logs the data, but does not log the requestor
information or the client information.

This patch logs the requestor information (downstream IP) and the
client information (client id and sequence number) for debugging.

Change-Id: I0a3bacd947146fb52f94ab09dda94c96ed6d8ab8
Reviewed-on: http://gerrit.cloudera.org:8080/19950
Tested-by: Kudu Jenkins
Reviewed-by: Yingchun Lai <la...@apache.org>


> Flush multiple data in AUTO_FLUSH_BACKGROUND mode maybe fail when the table schema has changed
> ----------------------------------------------------------------------------------------------
>
>                 Key: KUDU-3483
>                 URL: https://issues.apache.org/jira/browse/KUDU-3483
>             Project: Kudu
>          Issue Type: Bug
>            Reporter: Xixu Wang
>            Priority: Major
>         Attachments: image-2023-05-30-16-12-20-361.png
>
>
>  
> *1.The problem*
> Flushing multiple rows in auto_flush_background mode may fail when the table schema has changed. The following is the error message:
> !image-2023-05-30-16-12-20-361.png!
>  
> *2.How to repeat the case*
> 1. Create a table with 2 columns.
> 2. Insert a row into this table in auto_flush_background mode.
> 3. Add 3 new columns to this table.
> 4. Reopen the table.
> 5. Insert another row into this table in auto_flush_background mode.
> 6. Flush the buffer.
> {code:java}
> KuduTable table = createTable(ImmutableList.of());
> // Buffer a row that uses the original two-column schema.
> final KuduSession session = client.newSession();
> session.setFlushMode(SessionConfiguration.FlushMode.AUTO_FLUSH_BACKGROUND);
> Insert insert = table.newInsert();
> PartialRow row = insert.getRow();
> row.addInt("c0", 101);
> row.addInt("c1", 101);
> session.apply(insert);
> // Add some new columns.
> client.alterTable(tableName, new AlterTableOptions()
>   .addColumn("addNonNull", Type.INT32, 100)
>   .addNullableColumn("addNullable", Type.INT32)
>   .addNullableColumn("addNullableDef", Type.INT32, 200));
>
> // Reopen table for the new schema.
> table = client.openTable(tableName);
> assertEquals(5, table.getSchema().getColumnCount());
> // Buffer a second row that uses the new five-column schema,
> // with addNullableDef explicitly set to null.
> Insert newinsert = table.newInsert();
> PartialRow newrow = newinsert.getRow();
> newrow.addInt("c0", 101);
> newrow.addInt("c1", 101);
> newrow.addInt("addNonNull", 101);
> newrow.addInt("addNullable", 101);
> newrow.setNull("addNullableDef");
> session.apply(newinsert);
> // Flush both buffered rows; this is the step that fails.
> session.flush();
> {code}
>  
> *3.Why this problem happened*
> In auto_flush_background mode, an applied operation is first placed in a buffer. When the buffer is full or flush() is called, the client tries to flush the buffered rows to the Kudu server. First, it groups the rows by tablet id into batches, so a batch may contain multiple rows that belong to the same tablet. Each batch is then encoded into bytes. At this point, the encoder reads the table schema of the first row in the batch and uses it to decide the format of the data. If two rows in the batch belong to the same table but have different schemas, because the table was altered between the two inserts, encoding fails with an array index out of bounds exception.
>  
> By the way, it is hard to trace the whole process, especially in the Kudu tablet server, so it would be better to log the downstream IP and client id.
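
The failure mode explained in section 3 of the quoted description can be illustrated with a small, self-contained sketch. It deliberately does not use the Kudu client API: the class BatchEncodingSketch, its methods, and the representation of a row as a plain int array are all hypothetical stand-ins, and the real encoder's buffer layout is more involved. The sketch only assumes what the description states, namely that the layout for a whole batch is derived from the first row's schema.

```java
import java.util.List;

// Simplified model of the batch encoding problem; not Kudu's actual code.
class BatchEncodingSketch {
    /** Hypothetical stand-in for a buffered row: just its column values. */
    static int[] row(int... values) {
        return values;
    }

    /**
     * Sizes the output from the first row's column count only, then writes
     * every row's values into it, mirroring the behavior in the description.
     */
    static int[] encodeUsingFirstRowSchema(List<int[]> batch) {
        int columnsPerRow = batch.get(0).length; // schema taken from the first row
        int[] buffer = new int[columnsPerRow * batch.size()];
        int pos = 0;
        for (int[] r : batch) {
            for (int value : r) {
                buffer[pos++] = value; // overflows once a wider row is written
            }
        }
        return buffer;
    }

    /** Returns true when every row in the batch has the same column count. */
    static boolean hasUniformSchema(List<int[]> batch) {
        int expected = batch.get(0).length;
        for (int[] r : batch) {
            if (r.length != expected) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // First row uses the old 2-column schema, second row the new 5-column one.
        List<int[]> batch = List.of(row(101, 101), row(101, 101, 101, 101, 0));
        System.out.println("uniform schema: " + hasUniformSchema(batch));
        try {
            encodeUsingFirstRowSchema(batch);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("encoding failed: " + e.getMessage());
        }
    }
}
```

A check such as hasUniformSchema is only one conceivable guard, shown here to make the mismatch visible; the description does not say how the client should handle a batch with mixed schemas.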



--
This message was sent by Atlassian Jira
(v8.20.10#820010)