Posted to commits@hudi.apache.org by "Ethan Guo (Jira)" <ji...@apache.org> on 2023/02/07 22:53:00 UTC
[jira] [Updated] (HUDI-5243) Return num_affected_rows from sql INSERT statement
[ https://issues.apache.org/jira/browse/HUDI-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ethan Guo updated HUDI-5243:
----------------------------
Fix Version/s: 0.14.0
(was: 0.13.0)
> Return num_affected_rows from sql INSERT statement
> --------------------------------------------------
>
> Key: HUDI-5243
> URL: https://issues.apache.org/jira/browse/HUDI-5243
> Project: Apache Hudi
> Issue Type: Improvement
> Components: spark-sql
> Reporter: kazdy
> Assignee: kazdy
> Priority: Minor
> Fix For: 0.14.0
>
>
> Currently, when running Spark SQL DML, users who want to check how many rows were affected need to look up the commit stats using the Hudi CLI or a stored procedure.
> We can improve the user experience by returning num_affected_rows after an INSERT INTO command, so that Spark SQL users can easily see how many rows were inserted without having to inspect the commits themselves.
> num_affected_rows can be extracted in the writer itself from commitMetadata.
> Example:
> {code:java}
> spark.sql("""
> create table test_mor (id int, name string)
> using hudi
> tblproperties (primaryKey = 'id', type='mor');
> """)
> spark.sql(
> """
> INSERT INTO test_mor
> VALUES
> (1, "a"),
> (2, "b"),
> (3, "c"),
> (4, "d"),
> (5, "e"),
> (6, "f"),
> (7, "g")
> """).show()
> returns:
> +-----------------+
> |num_affected_rows|
> +-----------------+
> |                7|
> +-----------------+
> {code}
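As a rough illustration of how num_affected_rows might be derived from commit metadata, here is a minimal Python sketch. It is not the Hudi writer's implementation. The sample JSON is hypothetical, and the field names (partitionToWriteStats, numInserts, numUpdateWrites, numDeletes) are assumed to match the commit metadata layout. The helper name num_affected_rows is also made up for this example.

```python
import json

# Hypothetical, trimmed-down commit metadata for the 7-row INSERT above.
# Field names are assumed to follow Hudi's commit metadata JSON.
SAMPLE_COMMIT_METADATA = json.loads("""
{
  "partitionToWriteStats": {
    "": [
      {"fileId": "f1", "numWrites": 4, "numInserts": 4,
       "numUpdateWrites": 0, "numDeletes": 0},
      {"fileId": "f2", "numWrites": 3, "numInserts": 3,
       "numUpdateWrites": 0, "numDeletes": 0}
    ]
  },
  "operationType": "INSERT"
}
""")


def num_affected_rows(commit_metadata: dict) -> int:
    """Sum inserted, updated and deleted records over all write stats."""
    total = 0
    # Each partition maps to a list of per-file write stats.
    for write_stats in commit_metadata.get("partitionToWriteStats", {}).values():
        for stat in write_stats:
            total += stat.get("numInserts", 0)
            total += stat.get("numUpdateWrites", 0)
            total += stat.get("numDeletes", 0)
    return total


print(num_affected_rows(SAMPLE_COMMIT_METADATA))  # prints 7
```

Whether updates and deletes should count toward the total is a design choice. For a plain INSERT INTO, only numInserts would be non-zero in this sketch.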
--
This message was sent by Atlassian Jira
(v8.20.10#820010)