Posted to commits@hudi.apache.org by "Ethan Guo (Jira)" <ji...@apache.org> on 2023/02/07 22:53:00 UTC

[jira] [Updated] (HUDI-5243) Return num_affected_rows from sql INSERT statement

     [ https://issues.apache.org/jira/browse/HUDI-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ethan Guo updated HUDI-5243:
----------------------------
    Fix Version/s: 0.14.0
                       (was: 0.13.0)

> Return num_affected_rows from sql INSERT statement
> --------------------------------------------------
>
>                 Key: HUDI-5243
>                 URL: https://issues.apache.org/jira/browse/HUDI-5243
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: spark-sql
>            Reporter: kazdy
>            Assignee: kazdy
>            Priority: Minor
>             Fix For: 0.14.0
>
>
> Currently, when running Spark SQL DML, users who want to check how many rows were affected have to look up the commit stats using the Hudi CLI or a stored procedure.
> We can improve the user experience by returning num_affected_rows after an INSERT INTO command, so that Spark SQL users can easily see how many rows were inserted without having to inspect the commits themselves.
> num_affected_rows can be extracted in the writer itself from commitMetadata.
> Example:
> {code:java}
> spark.sql("""
> create table test_mor (id int, name string) 
> using hudi 
> tblproperties (primaryKey = 'id', type='mor');
> """)
> spark.sql(
> """
> INSERT INTO test_mor
> VALUES 
> (1, "a"),
> (2, "b"),
> (3, "c"),
> (4, "d"),
> (5, "e"),
> (6, "f"),
> (7, "g")
> """).show()
> // returns:
> // +-----------------+
> // |num_affected_rows|
> // +-----------------+
> // |                7|
> // +-----------------+
> {code}
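
The description notes that num_affected_rows can be derived from the commit metadata. A minimal sketch of that aggregation, in plain Python with no Hudi dependency, might look like the following. The JSON is a hypothetical, trimmed-down stand-in for a commit metadata file; the field names partitionToWriteStats and numInserts are assumed here for illustration, and the helper num_affected_rows is not part of any Hudi API.

```python
import json

# Hypothetical sample: a trimmed-down stand-in for commit metadata.
# The field names below are assumptions made for this sketch.
SAMPLE_COMMIT_METADATA = """
{
  "partitionToWriteStats": {
    "": [
      {"fileId": "file-1", "numInserts": 4, "numUpdateWrites": 0, "numDeletes": 0},
      {"fileId": "file-2", "numInserts": 3, "numUpdateWrites": 0, "numDeletes": 0}
    ]
  }
}
"""


def num_affected_rows(commit_metadata_json: str) -> int:
    """Sum the per-file insert counts across all partitions (illustrative helper)."""
    metadata = json.loads(commit_metadata_json)
    total = 0
    # Each partition maps to a list of per-file write stats.
    for write_stats in metadata.get("partitionToWriteStats", {}).values():
        for stat in write_stats:
            total += stat.get("numInserts", 0)
    return total


print(num_affected_rows(SAMPLE_COMMIT_METADATA))
```

For an INSERT, summing only the insert counter seems the natural choice; an UPDATE or DELETE would presumably need a different counter, which this sketch does not cover.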