Posted to commits@hudi.apache.org by "lvyanquan (Jira)" <ji...@apache.org> on 2023/02/24 09:41:00 UTC

[jira] [Updated] (HUDI-5846) ClassCastException thrown by run_bootstrap

     [ https://issues.apache.org/jira/browse/HUDI-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

lvyanquan updated HUDI-5846:
----------------------------
    Description: 
A ClassCastException is thrown when running the run_bootstrap procedure on a MERGE_ON_READ table.
h4. *version*
Spark 3.3
Hudi 0.13

h4. *error message*
{code:java}
Caused by: java.lang.ClassCastException: java.util.HashMap cannot be cast to org.apache.hudi.table.BulkInsertPartitioner
        at org.apache.hudi.table.action.commit.SparkBulkInsertHelper.bulkInsert(SparkBulkInsertHelper.java:77)
        at org.apache.hudi.table.action.deltacommit.SparkBulkInsertDeltaCommitActionExecutor.execute(SparkBulkInsertDeltaCommitActionExecutor.java:60)

{code}
 

h4. *how to reproduce*
Run the following in Spark SQL:
{code:sql}
create table hive_table (
  id int,
  ts int
) stored as parquet;

insert into hive_table values (1, 1);

create table hudi_mor (
  id int,
  ts int
) using hudi
tblproperties (
  type = 'mor',
  primaryKey = 'id',
  preCombineField = 'ts'
);

call run_bootstrap(
  table => 'hudi_mor',
  table_type => 'MERGE_ON_READ',
  bootstrap_path => 'hdfs://ns1/dtInsight/hive/warehouse/kunni.db/hive_table',
  base_path => 'hdfs://ns1/dtInsight/hive/warehouse/kunni.db/hudi_mor',
  rowKey_field => 'id',
  key_generator_class => 'org.apache.hudi.keygen.NonpartitionedKeyGenerator',
  bootstrap_overwrite => true,
  selector_class => 'org.apache.hudi.client.bootstrap.selector.FullRecordBootstrapModeSelector');
{code}
 

h4. *cause*
The org.apache.hudi.table.action.bootstrap.SparkBootstrapDeltaCommitActionExecutor#getBulkInsertActionExecutor method calls the wrong constructor of SparkBulkInsertDeltaCommitActionExecutor. Judging from the stack trace, a HashMap (presumably the extra commit metadata) lands in the argument slot that SparkBulkInsertHelper.bulkInsert later casts to BulkInsertPartitioner, which is where the ClassCastException is raised.
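
For illustration, here is a minimal, self-contained sketch of that failure mode. The classes below are hypothetical stand-ins rather than the actual Hudi types; they only mimic the roles of SparkBulkInsertDeltaCommitActionExecutor and BulkInsertPartitioner. The point is that once the partitioner travels inside an Optional whose type parameter is erased at runtime, handing over a map instead of a partitioner compiles cleanly and only blows up when the value is first cast:
{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-ins illustrating the constructor mixup (not Hudi source).
public class ConstructorMixupDemo {

  // Plays the role of org.apache.hudi.table.BulkInsertPartitioner.
  interface BulkInsertPartitioner {}

  // Plays the role of the executor whose last constructor argument
  // is supposed to carry the partitioner.
  static class BulkInsertExecutor {
    private final Optional<BulkInsertPartitioner> partitioner;

    @SuppressWarnings("unchecked")
    BulkInsertExecutor(Optional<?> partitioner) {
      // Unchecked cast: the element type is never verified here.
      this.partitioner = (Optional<BulkInsertPartitioner>) partitioner;
    }

    void execute() {
      // The element is only cast when first used, mirroring the failure
      // inside SparkBulkInsertHelper.bulkInsert.
      BulkInsertPartitioner p = partitioner.get(); // ClassCastException here
      System.out.println(p);
    }
  }

  public static void main(String[] args) {
    Optional<Map<String, String>> extraMetadata = Optional.of(new HashMap<>());
    // Compiles, because Optional<Map<...>> matches Optional<?>, but the
    // wrapped value is a HashMap, so execute() fails with:
    // java.util.HashMap cannot be cast to ...BulkInsertPartitioner
    new BulkInsertExecutor(extraMetadata).execute();
  }
}
{code}
Under that reading, the fix would be to call the constructor variant whose final argument really is the partitioner Option (or an empty Option when no partitioner is supplied), so that the metadata map never reaches that slot.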


> ClassCastException thrown by run_bootstrap
> -------------------------------------------
>
>                 Key: HUDI-5846
>                 URL: https://issues.apache.org/jira/browse/HUDI-5846
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: bootstrap, spark
>            Reporter: lvyanquan
>            Priority: Trivial



--
This message was sent by Atlassian Jira
(v8.20.10#820010)