Posted to issues@flink.apache.org by "Yuan Zhu (Jira)" <ji...@apache.org> on 2022/01/05 09:25:00 UTC

[jira] [Created] (FLINK-25529) java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter when write bulkly into hive-2.1.1 orc table

Yuan Zhu created FLINK-25529:
--------------------------------

             Summary: java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter when write bulkly into hive-2.1.1 orc table
                 Key: FLINK-25529
                 URL: https://issues.apache.org/jira/browse/FLINK-25529
             Project: Flink
          Issue Type: Bug
          Components: Connectors / Hive
         Environment: hive 2.1.1, flink 1.12.4
            Reporter: Yuan Zhu
         Attachments: lib.jpg

I tried to bulk-write data into a hive-2.1.1 table stored in ORC format and encountered java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter.

 

The bulk writer is enabled by setting table.exec.hive.fallback-mapred-writer = false, as in the following statements:

 
{code:java}
SET 'table.sql-dialect'='hive';
create table orders(
    order_id int,
    order_date timestamp,
    customer_name string,
    price decimal(10,3),
    product_id int,
    order_status boolean
)partitioned by (dt string)
stored as orc;
 
SET 'table.sql-dialect'='default';

create table datagen_source (
order_id int,
order_date timestamp(9),
customer_name varchar,
price decimal(10,3),
product_id int,
order_status boolean
)with('connector' = 'datagen');

create catalog myhive with ('type' = 'hive', 'hive-conf-dir' = '/mnt/conf');
set table.exec.hive.fallback-mapred-writer = false;

insert into myhive.`default`.orders
/*+ OPTIONS(
    'sink.partition-commit.trigger'='process-time',
    'sink.partition-commit.policy.kind'='metastore,success-file',
    'sink.rolling-policy.file-size'='128MB',
    'sink.rolling-policy.rollover-interval'='10s',
    'sink.rolling-policy.check-interval'='10s',
    'auto-compaction'='true',
    'compaction.file-size'='1MB'    ) */
select * , date_format(now(),'yyyy-MM-dd') as dt from datagen_source;
{code}
[ERROR] Could not execute SQL statement. Reason:
java.lang.ClassNotFoundException: org.apache.orc.PhysicalWriter

 

The jars in my lib directory are listed in the attachment (lib.jpg).

In HiveTableSink#createStreamSink (line 270), createBulkWriterFactory is called when table.exec.hive.fallback-mapred-writer is false.

If the table is stored as ORC, HiveShimV200#createOrcBulkWriterFactory is invoked.

OrcBulkWriterFactory depends on org.apache.orc.PhysicalWriter from orc-core, but flink-connector-hive excludes orc-core because it conflicts with hive-exec. As a result, the class is missing from the classpath at runtime.
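
One way to confirm whether the class is really missing is a small standalone probe run with the same jars as the Flink lib directory on its classpath. This is only a diagnostic sketch, not part of Flink: the class name OrcClasspathCheck and the method isOnClasspath are made up for illustration, and it only uses the standard Class.forName call.

```java
public class OrcClasspathCheck {

    /**
     * Returns true if the named class can be resolved by the given class loader.
     * Hypothetical helper for illustration; not a Flink or ORC API.
     */
    public static boolean isOnClasspath(String className, ClassLoader loader) {
        try {
            // initialize=false: resolve the class without running its static initializers
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError when a dependency of the class is missing
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        // The class named in the reported exception
        String className = "org.apache.orc.PhysicalWriter";
        System.out.println(className + " on classpath: " + isOnClasspath(className, loader));
    }
}
```

If the probe reports false with the lib directory jars on the classpath, that would be consistent with orc-core not being bundled, as described above.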

 


