Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/10/22 06:16:19 UTC

[GitHub] [hudi] mutoulbj opened a new issue #3845: [SUPPORT]`if not exists` doesn't work on create table in spark-sql

mutoulbj opened a new issue #3845:
URL: https://github.com/apache/hudi/issues/3845


   **_Tips before filing an issue_**
   
   - Have you gone through our [FAQs](https://cwiki.apache.org/confluence/display/HUDI/FAQ)?
   
   - Join the mailing list to engage in conversations and get faster support at dev-subscribe@hudi.apache.org.
   
   - If you have triaged this as a bug, then file an [issue](https://issues.apache.org/jira/projects/HUDI/issues) directly.
   
   **Describe the problem you faced**
   
   When using `create table if not exists` to create a Hudi table, an error is raised on re-run if the table already exists.
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Run `create table if not exists ttt (id bigint, name string) using hudi options (type='cow', primaryKey='id');`; the table `ttt` is created successfully.
   2. Rerun the same statement (see the consolidated session below); it fails with `Error in query: Specified schema in create table statement is not equal to the table schema. You should not specify the schema for an exist table` for `dev`.`ttt` (full stacktrace below).
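   
   For reference, the two steps above as one spark-sql session (the same statements, just reformatted; `dev` is the current database in my environment):
   
   ```
   -- first run: creates the table successfully
   create table if not exists ttt (id bigint, name string)
   using hudi
   options (type = 'cow', primaryKey = 'id');
   
   -- second run of the exact same statement: should be a no-op because of
   -- `if not exists`, but instead fails with the schema-mismatch error shown
   -- in the stacktrace below
   create table if not exists ttt (id bigint, name string)
   using hudi
   options (type = 'cow', primaryKey = 'id');
   ```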
   
   **Expected behavior**
   
   No exception should be raised when the table already exists; the statement should succeed as a no-op.
   
   **Environment Description**
   
   * Hudi version : 0.9.0

   * Spark version : 3.0.3

   * Hive version : 3.1.2

   * Hadoop version : 3.2.2

   * Storage (HDFS/S3/GCS..) : HDFS

   * Running on Docker? (yes/no) : no
   
   
   **Additional context**
   
   It seems Hudi cannot ignore its meta columns (`_hoodie_commit_time`, `_hoodie_commit_seqno`, `_hoodie_record_key`, `_hoodie_partition_path`, `_hoodie_file_name`) when comparing the schema in the `create table` statement against the schema of the existing table.
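   
   To make the mismatch concrete: describing the existing table shows the five `_hoodie_*` meta columns in addition to `id` and `name` (they are also visible in the `create_table` log entry below), while the `create table` statement only lists `id` and `name`. The error message also hints at a possible workaround, which I have not verified on 0.9.0: re-issuing the statement without a column list, so that no schema is specified for the existing table.
   
   ```
   -- the stored schema carries the Hudi meta columns plus the user columns:
   -- _hoodie_commit_time, _hoodie_commit_seqno, _hoodie_record_key,
   -- _hoodie_partition_path, _hoodie_file_name, id, name
   describe dev.ttt;
   
   -- possible (unverified) workaround suggested by the error message:
   -- omit the column list when the table already exists
   create table if not exists ttt using hudi options (type = 'cow', primaryKey = 'id');
   ```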
   
   **Stacktrace**
   
   ```> create table if not exists ttt (id bigint, name string) using hudi options (type='cow', primaryKey='id');
   1211789 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore  - 0: get_database: dev
   1211789 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore.audit  - ugi=root   ip=unknown-ip-addr      cmd=get_database: dev
   1211792 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore  - 0: get_database: dev
   1211792 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore.audit  - ugi=root   ip=unknown-ip-addr      cmd=get_database: dev
   1211804 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore  - 0: get_database: dev
   1211804 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore.audit  - ugi=root   ip=unknown-ip-addr      cmd=get_database: dev
   1211807 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore  - 0: get_database: dev
   1211807 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore.audit  - ugi=root   ip=unknown-ip-addr      cmd=get_database: dev
   1211821 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore  - 0: get_table : db=dev tbl=ttt
   1211821 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore.audit  - ugi=root   ip=unknown-ip-addr      cmd=get_table : db=dev tbl=ttt
   1211823 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore  - 0: get_database: dev
   1211823 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore.audit  - ugi=root   ip=unknown-ip-addr      cmd=get_database: dev
   1211826 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore  - 0: get_database: dev
   1211826 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore.audit  - ugi=root   ip=unknown-ip-addr      cmd=get_database: dev
   1211857 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore  - 0: get_database: dev
   1211857 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore.audit  - ugi=root   ip=unknown-ip-addr      cmd=get_database: dev
   1211859 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore  - 0: get_table : db=dev tbl=ttt
   1211859 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore.audit  - ugi=root   ip=unknown-ip-addr      cmd=get_table : db=dev tbl=ttt
   1211881 [main] INFO  org.apache.spark.sql.hive.HiveUtils  - Initializing HiveMetastoreConnection version 2.3.7 using Spark classes.
   1211881 [main] INFO  org.apache.spark.sql.hive.client.HiveClientImpl  - Warehouse location for Hive client (version 2.3.7) is /user/hive/warehouse
   1211902 [main] INFO  org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAccessController  - Created SQLStdHiveAccessController for session context : HiveAuthzSessionContext [sessionString=1e7f3c52-32fc-48fc-990c-76c55678273d, clientType=HIVECLI]
   1211903 [main] WARN  org.apache.hadoop.hive.ql.session.SessionState  - METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.
   1211903 [main] INFO  hive.metastore  - Mestastore configuration hive.metastore.filter.hook changed from org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl to org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook
   1211904 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore  - 0: Cleaning up thread local RawStore...
   1211904 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore.audit  - ugi=root   ip=unknown-ip-addr      cmd=Cleaning up thread local RawStore...
   1211904 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore  - 0: Done cleaning up thread local RawStore
   1211905 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore.audit  - ugi=root   ip=unknown-ip-addr      cmd=Done cleaning up thread local RawStore
   1211929 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore  - 0: create_table: Table(tableName:ttt, dbName:dev, owner:root, createTime:1634883038, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:_hoodie_commit_time, type:string, comment:null), FieldSchema(name:_hoodie_commit_seqno, type:string, comment:null), FieldSchema(name:_hoodie_record_key, type:string, comment:null), FieldSchema(name:_hoodie_partition_path, type:string, comment:null), FieldSchema(name:_hoodie_file_name, type:string, comment:null), FieldSchema(name:id, type:bigint, comment:null), FieldSchema(name:name, type:string, comment:null)], location:hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt, inputFormat:org.apache.hudi.hadoop.HoodieParquetInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe, parameters:{path=hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt, serialization.format=1, type=cow, primaryKey=id}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{})), partitionKeys:[], parameters:{spark.sql.sources.schema.part.0={"type":"struct","fields":[{"name":"_hoodie_commit_time","type":"string","nullable":true,"metadata":{}},{"name":"_hoodie_commit_seqno","type":"string","nullable":true,"metadata":{}},{"name":"_hoodie_record_key","type":"string","nullable":true,"metadata":{}},{"name":"_hoodie_partition_path","type":"string","nullable":true,"metadata":{}},{"name":"_hoodie_file_name","type":"string","nullable":true,"metadata":{}},{"name":"id","type":"long","nullable":true,"metadata":{}},{"name":"name","type":"string","nullable":true,"metadata":{}}]}, spark.sql.sources.schema.numParts=1, spark.sql.sources.provider=hudi, spark.sql.create.version=3.0.3}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, privileges:PrincipalPrivilegeSet(userPrivileges:{root=[PrivilegeGrantInfo(privilege:INSERT, createTime:-1, grantor:root, grantorType:USER, grantOption:true), PrivilegeGrantInfo(privilege:SELECT, createTime:-1, grantor:root, grantorType:USER, grantOption:true), PrivilegeGrantInfo(privilege:UPDATE, createTime:-1, grantor:root, grantorType:USER, grantOption:true), PrivilegeGrantInfo(privilege:DELETE, createTime:-1, grantor:root, grantorType:USER, grantOption:true)]}, groupPrivileges:null, rolePrivileges:null))
   1211929 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore.audit  - ugi=root   ip=unknown-ip-addr      cmd=create_table: Table(tableName:ttt, dbName:dev, owner:root, createTime:1634883038, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:_hoodie_commit_time, type:string, comment:null), FieldSchema(name:_hoodie_commit_seqno, type:string, comment:null), FieldSchema(name:_hoodie_record_key, type:string, comment:null), FieldSchema(name:_hoodie_partition_path, type:string, comment:null), FieldSchema(name:_hoodie_file_name, type:string, comment:null), FieldSchema(name:id, type:bigint, comment:null), FieldSchema(name:name, type:string, comment:null)], location:hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt, inputFormat:org.apache.hudi.hadoop.HoodieParquetInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe, parameters:{path=hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt, serialization.format=1, type=cow, primaryKey=id}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{})), partitionKeys:[], parameters:{spark.sql.sources.schema.part.0={"type":"struct","fields":[{"name":"_hoodie_commit_time","type":"string","nullable":true,"metadata":{}},{"name":"_hoodie_commit_seqno","type":"string","nullable":true,"metadata":{}},{"name":"_hoodie_record_key","type":"string","nullable":true,"metadata":{}},{"name":"_hoodie_partition_path","type":"string","nullable":true,"metadata":{}},{"name":"_hoodie_file_name","type":"string","nullable":true,"metadata":{}},{"name":"id","type":"long","nullable":true,"metadata":{}},{"name":"name","type":"string","nullable":true,"metadata":{}}]}, spark.sql.sources.schema.numParts=1, spark.sql.sources.provider=hudi, spark.sql.create.version=3.0.3}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, privileges:PrincipalPrivilegeSet(userPrivileges:{root=[PrivilegeGrantInfo(privilege:INSERT, createTime:-1, grantor:root, grantorType:USER, grantOption:true), PrivilegeGrantInfo(privilege:SELECT, createTime:-1, grantor:root, grantorType:USER, grantOption:true), PrivilegeGrantInfo(privilege:UPDATE, createTime:-1, grantor:root, grantorType:USER, grantOption:true), PrivilegeGrantInfo(privilege:DELETE, createTime:-1, grantor:root, grantorType:USER, grantOption:true)]}, groupPrivileges:null, rolePrivileges:null))
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.metastore.wm.default.pool.size does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.task.scheduler.preempt.independent does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.output.format.arrow does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.tez.llap.min.reducer.per.executor does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.arrow.root.allocator.limit does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.vectorized.use.checked.expressions does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.tez.dynamic.semijoin.reduction.for.mapjoin does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.vectorized.complex.types.enabled does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.server2.wm.worker.threads does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.repl.partitions.dump.parallelism does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.metastore.uri.selection does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.strict.checks.no.partition.filter does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.tez.dynamic.semijoin.reduction.for.dpp.factor does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.stats.filter.in.min.ratio does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.metastore.client.cache.initial.capacity does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.stats.ndv.estimate.percent does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.server2.webui.cors.allowed.methods does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.optimize.joinreducededuplication does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.metastore.client.cache.enabled does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.stats.fetch.bitvector does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.disable.unsafe.external.table.operations does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.materializedview.rewriting.incremental does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.server2.materializedviews.registry.impl does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.metastore.event.db.notification.api.auth does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.exec.orc.delta.streaming.optimizations.enabled does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.stats.ndv.algo does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.spark.job.max.tasks does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.msck.repair.batch.max.retries does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.prewarm.spark.timeout does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.optimize.update.table.properties.from.serde.list does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.plugin.client.num.threads does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.test.bucketcodec.version does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.query.reexecution.enabled does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.materializedview.rewriting.time.window does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.query.reexecution.stats.cache.batch.size does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.server2.webui.cors.allowed.headers does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.join.inner.residual does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.server2.active.passive.ha.enable does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.io.trace.always.dump does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.query.reexecution.stats.persist.scope does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.mm.allow.originals does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.internal.ss.authz.settings.applied.marker does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.compactor.compact.insert.only does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.txn.xlock.iow does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.spark.rsc.conf.list does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.stats.jdbc.timeout does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.cache.defaultfs.only.native.fileid does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.spark.optimize.shuffle.serde does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.testing.remove.logs does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.distcp.privileged.doAs does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.strict.checks.orderby.no.limit does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.metastore.client.cache.expiry.time does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.io.allocator.defrag.headroom does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.notification.event.consumers does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.vectorized.input.format.supports.enabled does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.metastore.client.cache.max.capacity does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.repl.dumpdir.clean.freq does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.spark.use.ts.stats.for.mapjoin does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.repl.dump.include.acid.tables does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.server2.webui.use.pam does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.query.reexecution.max.count does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.io.share.object.pools does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.optimize.update.table.properties.from.serde does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.service.metrics.codahale.reporter.classes does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.tez.session.events.print.summary does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.io.vrb.queue.limit.base does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.mm.avoid.s3.globstatus does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.repl.replica.functions.root.dir does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.query.results.cache.max.entry.lifetime does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.server2.limit.connections.per.user does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.server2.thrift.http.compression.enabled does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.vectorized.execution.ptf.enabled does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.optimize.shared.work.extended does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.vectorized.row.identifier.enabled does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.query.reexecution.always.collect.operator.stats does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.repl.dumpdir.ttl does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.local.time.zone does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.server2.tez.wm.am.registry.timeout does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.server2.active.passive.ha.registry.namespace does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.create.as.insert.only does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.mapjoin.memory.oversubscribe.factor does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.arrow.batch.size does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.notification.sequence.lock.retry.sleep.interval does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.repl.approx.max.load.tasks does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.query.results.cache.enabled does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.legacy.schema.for.all.serdes does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.tez.dag.status.check.interval does not exist
   1211954 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.druid.bitmap.type does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.spark.dynamic.partition.pruning.map.join.only does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.memory.oversubscription.max.executors.per.query does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.io.trace.size does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.plugin.rpc.num.handlers does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.server2.wm.allow.any.pool.via.jdbc does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.vectorized.groupby.complex.types.enabled does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.avro.timestamp.skip.conversion does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.query.results.cache.nontransactional.tables.enabled does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.stats.correlated.multi.key.joins does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.metastore.db.type does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.streaming.auto.flush.check.interval.size does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.zookeeper.connection.timeout does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.query.reexecution.strategies does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.server2.limit.connections.per.user.ipaddress does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.mapjoin.memory.monitor.check.interval does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.optimize.shared.work does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.stats.estimate does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.io.allocator.discard.method does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.tez.cartesian-product.enabled does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.notification.sequence.lock.max.retries does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.heap.memory.monitor.usage.threshold does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.privilege.synchronizer.interval does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.vectorized.adaptor.suppress.evaluate.exceptions does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.materializedview.rebuild.incremental does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.query.results.cache.max.entry.size does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.spark.stage.max.tasks does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.testing.short.logs does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.streaming.auto.flush.enabled does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.spark.explain.user does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.server2.operation.log.cleanup.delay does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.repl.dump.metadata.only does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.optimize.countdistinct does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.auto.convert.join.shuffle.max.size does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.plugin.acl does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.metastore.schema.info.class does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.server2.tez.queue.access.check does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.external.splits.temp.table.storage.format does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.io.row.wrapper.enabled does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.constraint.notnull.enforce does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.cli.print.escape.crlf does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.trigger.validation.interval does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.server2.webui.cors.allowed.origins does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.server2.limit.connections.per.ipaddress does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.external.splits.order.by.force.single.split does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.metastore.client.cache.stats.enabled does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.notification.event.poll.interval does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.transactional.concatenate.noblock does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.materializedview.rewriting.strategy does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.vectorized.if.expr.mode does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.exim.test.mode does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.query.results.cache.directory does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.query.results.cache.wait.for.pending.results does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.remove.orderby.in.subquery does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.tez.bmj.use.subcache does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.io.vrb.queue.limit.min does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.server2.wm.pool.metrics does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.repl.add.raw.reserved.namespace does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.resource.use.hdfs.location does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.stats.num.nulls.estimate.percent does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.io.acid does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.zk.sm.session.timeout does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.vectorized.ptf.max.memory.buffering.batch.count does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.task.scheduler.am.registry does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.druid.overlord.address.default does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.optimize.remove.sq_count_check does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.server2.webui.enable.cors does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.stats.retries.wait does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.vectorized.row.serde.inputformat.excludes does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.query.reexecution.stats.cache.size does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.combine.equivalent.work.optimization does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.lock.query.string.max.length does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.llap.io.track.cache.usage does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.use.orc.codec.pool does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.query.results.cache.max.size does not exist
   1211955 [main] WARN  org.apache.hadoop.hive.conf.HiveConf  - HiveConf of name hive.repl.bootstrap.dump.open.txn.timeout does not exist
   1211955 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore  - 0: Opening raw store with implementation class:org.apache.hadoop.hive.metastore.ObjectStore
   1211956 [main] INFO  org.apache.hadoop.hive.metastore.ObjectStore  - ObjectStore, initialize called
   1211961 [main] INFO  org.apache.hadoop.hive.metastore.MetaStoreDirectSql  - Using direct SQL, underlying DB is MYSQL
   1211961 [main] INFO  org.apache.hadoop.hive.metastore.ObjectStore  - Initialized ObjectStore
   1211964 [main] WARN  org.apache.hadoop.hive.metastore.HiveMetaStore  - Location: hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt specified for non-external table:ttt
   1211965 [main] INFO  org.apache.hadoop.hive.common.FileUtils  - Creating directory if it doesn't exist: hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt
   1212097 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore  - 0: get_database: dev
   1212097 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore.audit  - ugi=root   ip=unknown-ip-addr      cmd=get_database: dev
   1212099 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore  - 0: get_database: dev
   1212099 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore.audit  - ugi=root   ip=unknown-ip-addr      cmd=get_database: dev
   1212103 [main] INFO  org.apache.spark.sql.hudi.command.CreateHoodieTableCommand  - Init hoodie.properties for ttt
   1212116 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Initializing hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt as hoodie table hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt
   1212271 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Loading HoodieTableMetaClient from hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt
   1212272 [main] INFO  org.apache.hudi.common.table.HoodieTableConfig  - Loading table properties from hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt/.hoodie/hoodie.properties
   1212275 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt
   1212275 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Finished initializing Table of type COPY_ON_WRITE from hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt
   Time taken: 0.499 seconds
   1212281 [main] INFO  org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver  - Time taken: 0.499 seconds
   spark-sql> 
            > 
            > create table if not exists ttt (id bigint, name string) using hudi options (type='cow', primaryKey='id');
   1272842 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore  - 0: get_database: dev
   1272842 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore.audit  - ugi=root   ip=unknown-ip-addr      cmd=get_database: dev
   1272845 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore  - 0: get_database: dev
   1272845 [main] INFO  org.apache.hadoop.hive.metastore.HiveMetaStore.audit  - ugi=root   ip=unknown-ip-addr      cmd=get_database: dev
   1272850 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Loading HoodieTableMetaClient from hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt
   1272852 [main] INFO  org.apache.hudi.common.table.HoodieTableConfig  - Loading table properties from hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt/.hoodie/hoodie.properties
   1272855 [main] INFO  org.apache.hudi.common.table.HoodieTableMetaClient  - Finished Loading Table of type COPY_ON_WRITE(version=1, baseFileFormat=PARQUET) from hdfs://192.168.42.75:9000/user/hive/warehouse/dev.db/ttt
   1272856 [main] INFO  org.apache.hudi.common.table.timeline.HoodieActiveTimeline  - Loaded instants upto : Optional.empty
   Error in query: Specified schema in create table statement is not equal to the table schema.You should not specify the schema for an exist table: `dev`.`ttt` ;
   spark-sql> ```
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] BenjMaq commented on issue #3845: [SUPPORT]`if not exists` doesn't work on create table in spark-sql

Posted by GitBox <gi...@apache.org>.
BenjMaq commented on issue #3845:
URL: https://github.com/apache/hudi/issues/3845#issuecomment-949640297


   I am facing the same issue





[GitHub] [hudi] xushiyan commented on issue #3845: [SUPPORT]`if not exists` doesn't work on create table in spark-sql

Posted by GitBox <gi...@apache.org>.
xushiyan commented on issue #3845:
URL: https://github.com/apache/hudi/issues/3845#issuecomment-950442738


   @mutoulbj @BenjMaq Thanks for raising this! It does make sense to print a message indicating the table exists instead of erroring. Filed a JIRA (linked below); please feel free to take it if you're interested!
   
   https://issues.apache.org/jira/browse/HUDI-2611
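   
   In spark-sql terms, the intended behavior would be roughly the following (a sketch of the expectation, not the current behavior):
   
   ```
   -- table `dev`.`ttt` already exists and `if not exists` was specified,
   -- so the statement should print a "table already exists" style message
   -- and return successfully instead of raising the schema-mismatch error
   create table if not exists ttt (id bigint, name string) using hudi options (type='cow', primaryKey='id');
   ```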





[GitHub] [hudi] xushiyan closed issue #3845: [SUPPORT]`if not exists` doesn't work on create table in spark-sql

Posted by GitBox <gi...@apache.org>.
xushiyan closed issue #3845:
URL: https://github.com/apache/hudi/issues/3845


   

