Posted to issues-all@impala.apache.org by "Wenzhe Zhou (Jira)" <ji...@apache.org> on 2021/03/17 00:30:00 UTC

[jira] [Comment Edited] (IMPALA-10564) No error returned when inserting an overflowed value into a decimal column

    [ https://issues.apache.org/jira/browse/IMPALA-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302967#comment-17302967 ] 

Wenzhe Zhou edited comment on IMPALA-10564 at 3/17/21, 12:29 AM:
-----------------------------------------------------------------

Since we write a row column by column, we have to rewind the table writer past the partially written row if we want to skip a row with invalid column data.

I read the code of the table writers for the different formats to confirm whether we can rewind the writer past a partially written row. It seems that this is not hard for the Kudu, Text and HBase formats, since they buffer row data before writing a row to the table. But it's really hard to find a good way for Parquet to skip a partially written row.

The Kudu sink (KuduTableSink::Send() in kudu-table-sink.cc) creates a KuduWriteOperation object for each row and pushes the object into a vector after adding all columns. We could change the code so that the KuduWriteOperation object is not pushed into the vector if any column has invalid data, as in the sketch below.
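
Here is a minimal sketch of that change, assuming a hypothetical per-column helper SetColumnValue() and a hypothetical RowData type; it is not the actual kudu-table-sink.cc code, only an illustration of deferring the push until the whole row has been built:

{code:cpp}
#include <memory>
#include <vector>

#include <kudu/client/client.h>
#include <kudu/client/write_op.h>

struct RowData;  // hypothetical stand-in for the sink's internal row representation

// Hypothetical helper: converts one column of 'row' and writes it into 'dst';
// returns false on a conversion error such as a decimal overflow.
bool SetColumnValue(kudu::KuduPartialRow* dst, const RowData& row, int col);

bool BuildRowOp(kudu::client::KuduTable* table, const RowData& row, int num_cols,
                std::vector<std::unique_ptr<kudu::client::KuduWriteOperation>>* ops) {
  std::unique_ptr<kudu::client::KuduWriteOperation> op(table->NewInsert());
  for (int col = 0; col < num_cols; ++col) {
    if (!SetColumnValue(op->mutable_row(), row, col)) {
      return false;  // drop the partially built operation; nothing was pushed
    }
  }
  ops->push_back(std::move(op));  // only fully built rows reach the vector
  return true;
}
{code}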

The text table writer (HdfsTextTableWriter::AppendRows() in hdfs-text-table-writer.cc) uses a stringstream to buffer the row data. The buffered data cannot simply be rewound, but we could save the ending offset of the last complete row and, if we hit an invalid column value while writing a new row, flush the stringstream only up to that offset and then reset the stream for the next row.
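
The following is a minimal, self-contained sketch of that idea (not the actual HdfsTextTableWriter code): on a bad column value the put position is moved back to the end of the last complete row, and only the bytes up to that saved offset are ever flushed:

{code:cpp}
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

int main() {
  std::stringstream rowdata;
  std::streampos last_row_end = 0;  // ending offset of the last complete row

  // "BAD" stands in for a column value that fails conversion (e.g. overflow).
  std::vector<std::vector<std::string>> rows = {{"1", "ok"}, {"2", "BAD"}, {"3", "ok"}};

  for (const auto& row : rows) {
    bool row_ok = true;
    for (const auto& col : row) {
      if (col == "BAD") { row_ok = false; break; }
      rowdata << col << ',';
    }
    if (!row_ok) {
      rowdata.seekp(last_row_end);  // the next row overwrites the partial one
      continue;
    }
    rowdata << '\n';
    last_row_end = rowdata.tellp();  // remember where the complete row ends
  }

  // Flush only up to the end of the last complete row.
  std::string buffered = rowdata.str();
  std::cout << buffered.substr(0, static_cast<std::size_t>(std::streamoff(last_row_end)));
  return 0;
}
{code}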

Although HBase is known as a column-oriented database, the data for a particular row actually stays together; it is the data of one column that is spread across rows rather than stored contiguously. The HBase table writer (HBaseTableWriter::AppendRows() in hbase-table-writer.cc) creates one "Put" object per row, collects a batch of "Put" objects in a Java ArrayList, and then writes the whole batch of rows in one function call. We can change the code so that the "Put" is not added to the ArrayList right after it is created; instead, it is added only after all columns have been successfully added to the "Put" (see the sketch below).
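
A minimal, self-contained sketch of that ordering, using a plain struct and a std::vector as stand-ins for the Java "Put" and ArrayList that the real writer manipulates through JNI (none of these names are from hbase-table-writer.cc):

{code:cpp}
#include <string>
#include <utility>
#include <vector>

// Stand-in for org.apache.hadoop.hbase.client.Put: one object per row.
struct Put {
  std::string row_key;
  std::vector<std::string> cells;
};

// Returns false on a bad column value (e.g. a decimal overflow). The half-built
// Put then simply goes out of scope and the batch is left untouched.
bool AppendRow(const std::string& key, const std::vector<std::string>& columns,
               std::vector<Put>* batch) {
  Put put{key, {}};
  for (const std::string& col : columns) {
    if (col == "OVERFLOW") return false;  // stand-in for the conversion error
    put.cells.push_back(col);
  }
  batch->push_back(std::move(put));  // only complete rows join the batch
  return true;
}
{code}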

Parquet, however, is a truly column-oriented file format. The Parquet table writer (HdfsParquetTableWriter in hdfs-parquet-table-writer.cc) creates a BaseColumnWriter object for each column, and each column writer has its own data page that buffers that column's values across rows. When the current data page of a BaseColumnWriter fills up, it is flushed (finalized by calling FinalizeCurrentPage()). It's really complicated (maybe not feasible) to rewind a column writer after its data page has been flushed, and since the data pages of different column writers are flushed independently, it's hard to rewind the table writer to skip a partially written row. The sketch below illustrates the problem.
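
Here is a small, self-contained sketch that only mirrors the shape of the column writers (it is not the real BaseColumnWriter code): each writer flushes its own page whenever that page fills up, so by the time a later column of a row reports an error, earlier columns of the same row may already sit inside finalized pages that cannot be taken back:

{code:cpp}
#include <cstddef>
#include <vector>

class ColumnWriter {
 public:
  explicit ColumnWriter(std::size_t page_capacity) : capacity_(page_capacity) {}

  void Append(int value) {
    page_.push_back(value);
    // Each column decides on its own when to flush, independently of the others.
    if (page_.size() == capacity_) FinalizeCurrentPage();
  }

 private:
  void FinalizeCurrentPage() {
    // In the real writer the page is compressed and handed off to the output
    // stream; after this point the buffered values can no longer be rewound.
    page_.clear();
  }

  std::size_t capacity_;
  std::vector<int> page_;
};

// Appending one row touches every column writer in turn. If, say, column 2 of a
// row fails, columns 0 and 1 may already have flushed the pages containing that
// row's values, so simply dropping the row is no longer possible.
{code}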

Based on the above investigation, we will not support row skipping. Instead, we will add a new query option, "use_null_for_decimal_errors", as mentioned in Aman's comments.

 



> No error returned when inserting an overflowed value into a decimal column
> --------------------------------------------------------------------------
>
>                 Key: IMPALA-10564
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10564
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend, Frontend
>    Affects Versions: Impala 4.0
>            Reporter: Wenzhe Zhou
>            Assignee: Wenzhe Zhou
>            Priority: Major
>
> When using CTAS or INSERT-SELECT statements to insert rows into a table with decimal columns, Impala inserts NULL for overflowed decimal values instead of returning an error. This issue happens when the data expression for the decimal column in the SELECT sub-query contains at least one alias. The issue is similar to IMPALA-6340, but IMPALA-6340 only fixed the cases where the data expression for the decimal column is a constant, so the overflowed value could be detected by the frontend during expression analysis. If there is an alias (a variable) in the data expression for the decimal column, the frontend cannot evaluate the expression during the analysis phase; only the backend can evaluate it while executing the fragment instances of the SELECT sub-query. The log messages showed that the executor detected the decimal overflow error, but somehow it did not propagate the error to the coordinator, so the error was not returned to the client.


